Commit fbc1946 by ThibaultLSDC · 1 parent: 9164479
Files changed (48)
  1. app.py +9 -0
  2. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/exp_args.pkl +0 -0
  3. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/experiment.log +138 -0
  4. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/goal_object.pkl.gz +0 -0
  5. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/package_versions.txt +231 -0
  6. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_0.png +0 -0
  7. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_1.png +0 -0
  8. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_2.png +0 -0
  9. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_3.png +0 -0
  10. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_0.pkl.gz +0 -0
  11. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_1.pkl.gz +0 -0
  12. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_2.pkl.gz +0 -0
  13. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_3.pkl.gz +0 -0
  14. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/summary_info.json +44 -0
  15. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/exp_args.pkl +0 -0
  16. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/experiment.log +98 -0
  17. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/goal_object.pkl.gz +0 -0
  18. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/package_versions.txt +231 -0
  19. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/screenshot_step_0.png +0 -0
  20. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/screenshot_step_1.png +0 -0
  21. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/screenshot_step_2.png +0 -0
  22. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/step_0.pkl.gz +0 -0
  23. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/step_1.pkl.gz +0 -0
  24. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/step_2.pkl.gz +0 -0
  25. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/summary_info.json +44 -0
  26. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/exp_args.pkl +0 -0
  27. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/experiment.log +58 -0
  28. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/goal_object.pkl.gz +0 -0
  29. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/package_versions.txt +231 -0
  30. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/screenshot_step_0.png +0 -0
  31. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/screenshot_step_1.png +0 -0
  32. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/step_0.pkl.gz +0 -0
  33. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/step_1.pkl.gz +0 -0
  34. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/summary_info.json +44 -0
  35. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/exp_args.pkl +0 -0
  36. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/experiment.log +58 -0
  37. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/goal_object.pkl.gz +0 -0
  38. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/package_versions.txt +231 -0
  39. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/screenshot_step_0.png +0 -0
  40. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/screenshot_step_1.png +0 -0
  41. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/step_0.pkl.gz +0 -0
  42. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/step_1.pkl.gz +0 -0
  43. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/summary_info.json +44 -0
  44. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/error_report_trial_1_of_3.md +0 -0
  45. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/result_df_trial_1_of_3.csv +5 -0
  46. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/study.pkl.gz +0 -0
  47. data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/summary_df_trial_1_of_3.csv +2 -0
  48. requirements.txt +1 -0
app.py ADDED
@@ -0,0 +1,9 @@
+ import os
+
+ os.environ["AGENTLAB_EXP_ROOT"] = os.path.abspath(
+     os.path.join(os.path.dirname(__file__), "data")
+ )
+
+ from agentlab.analyze.agent_xray import run_gradio
+
+ run_gradio()
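
The path setup in `app.py` can be checked in isolation. The following is a minimal sketch, assuming the layout shown in this commit (a `data` folder next to the script, holding one study directory per run). It does not import agentlab; `resolve_exp_root` and `list_study_dirs` are hypothetical helper names used only for illustration.

```python
import os
from pathlib import Path
from typing import List


def resolve_exp_root(script_path: str) -> str:
    """Return the absolute path of the `data` folder next to the script.

    Mirrors the expression app.py assigns to AGENTLAB_EXP_ROOT.
    """
    return os.path.abspath(os.path.join(os.path.dirname(script_path), "data"))


def list_study_dirs(exp_root: str) -> List[str]:
    """Hypothetical helper: sorted names of sub-directories under exp_root."""
    root = Path(exp_root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Calling `list_study_dirs(resolve_exp_root(__file__))` from a script placed beside `data` would be expected to list the study folder added in this commit, which is a quick way to confirm the directory exists before launching the viewer.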
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/exp_args.pkl ADDED
Binary file (2.31 kB).
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/experiment.log ADDED
@@ -0,0 +1,138 @@
1
+ 2025-01-21 11:42:54,795 - 93650 - browsergym.experiments.loop - INFO - Running experiment GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20 in:
2
+ /Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20
3
+ 2025-01-21 11:42:54,875 - 93650 - browsergym.experiments.loop - DEBUG - Agent created.
4
+ 2025-01-21 11:42:54,883 - 93650 - browsergym.experiments.loop - DEBUG - Environment created.
5
+ 2025-01-21 11:42:54,883 - 93650 - asyncio - DEBUG - Using selector: KqueueSelector
6
+ 2025-01-21 11:42:58,144 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='about:blank'>
7
+ 2025-01-21 11:42:58,208 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
8
+ 2025-01-21 11:42:58,228 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
9
+ 2025-01-21 11:42:58,243 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
10
+ 2025-01-21 11:42:58,455 - 93650 - browsergym.core.observation - DEBUG - Marking frame ''
11
+ 2025-01-21 11:42:59,017 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
12
+ 2025-01-21 11:42:59,017 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
13
+ 2025-01-21 11:42:59,017 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 5720
14
+ 2025-01-21 11:42:59,036 - 93650 - browsergym.experiments.loop - DEBUG - Environment reset.
15
+ 2025-01-21 11:42:59,036 - 93650 - browsergym.experiments.loop - DEBUG - Starting step 0.
16
+ 2025-01-21 11:42:59,835 - 93650 - httpcore.connection - DEBUG - connect_tcp.started host='ui-assist.openai.azure.com' port=443 local_address=None timeout=5.0 socket_options=None
17
+ 2025-01-21 11:43:00,130 - 93650 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x1a4300650>
18
+ 2025-01-21 11:43:00,130 - 93650 - httpcore.connection - DEBUG - start_tls.started ssl_context=<ssl.SSLContext object at 0x1a3ea07a0> server_hostname='ui-assist.openai.azure.com' timeout=5.0
19
+ 2025-01-21 11:43:00,675 - 93650 - httpcore.connection - DEBUG - start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x1a4db7cd0>
20
+ 2025-01-21 11:43:00,676 - 93650 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
21
+ 2025-01-21 11:43:00,676 - 93650 - httpcore.http11 - DEBUG - send_request_headers.complete
22
+ 2025-01-21 11:43:00,676 - 93650 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
23
+ 2025-01-21 11:43:00,676 - 93650 - httpcore.http11 - DEBUG - send_request_body.complete
24
+ 2025-01-21 11:43:00,676 - 93650 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
25
+ 2025-01-21 11:43:02,558 - 93650 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Length', b'1542'), (b'Content-Type', b'application/json'), (b'apim-request-id', b'f02d1c18-f499-44d1-ae87-8d0b82cb086a'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'x-accel-buffering', b'no'), (b'x-ms-rai-invoked', b'true'), (b'x-request-id', b'f4dccbe6-35c9-453d-b9e8-1d585404974a'), (b'x-ms-region', b'East US'), (b'x-ratelimit-remaining-requests', b'118369'), (b'x-ratelimit-remaining-tokens', b'11819217'), (b'azureml-model-session', b'd015-20250107143625'), (b'x-envoy-upstream-service-time', b'1597'), (b'x-ms-client-request-id', b'f02d1c18-f499-44d1-ae87-8d0b82cb086a'), (b'Date', b'Tue, 21 Jan 2025 16:43:02 GMT')])
26
+ 2025-01-21 11:43:02,559 - 93650 - httpx - INFO - HTTP Request: POST https://ui-assist.openai.azure.com/openai/deployments/gpt-4o-mini-2024-07-18/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
27
+ 2025-01-21 11:43:02,559 - 93650 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
28
+ 2025-01-21 11:43:02,559 - 93650 - httpcore.http11 - DEBUG - receive_response_body.complete
29
+ 2025-01-21 11:43:02,559 - 93650 - httpcore.http11 - DEBUG - response_closed.started
30
+ 2025-01-21 11:43:02,559 - 93650 - httpcore.http11 - DEBUG - response_closed.complete
31
+ 2025-01-21 11:43:02,759 - 93650 - browsergym.experiments.loop - DEBUG - Agent chose action:
32
+ click('18')
33
+ 2025-01-21 11:43:02,767 - 93650 - browsergym.experiments.loop - DEBUG - Step info saved.
34
+ 2025-01-21 11:43:02,767 - 93650 - browsergym.experiments.loop - INFO - I need to select the checkboxes for "SQkB" and "ug" before clicking the "Submit" button. The checkboxes are currently unchecked. I will first click the checkbox for "SQkB" using its bid '18', and then I will click the checkbox for "ug" using its bid '30'.
35
+
36
+ action:
37
+ click('18')
38
+
39
+ 2025-01-21 11:43:02,771 - 93650 - browsergym.experiments.loop - DEBUG - Chat info sent.
40
+ 2025-01-21 11:43:02,771 - 93650 - browsergym.experiments.loop - DEBUG - Sending action to environment.
41
+ 2025-01-21 11:43:02,771 - 93650 - browsergym.core.env - DEBUG - Executing action
42
+ 2025-01-21 11:43:02,849 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
43
+ 2025-01-21 11:43:02,850 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
44
+ 2025-01-21 11:43:02,851 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
45
+ 2025-01-21 11:43:02,852 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
46
+ 2025-01-21 11:43:02,869 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
47
+ 2025-01-21 11:43:02,871 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
48
+ 2025-01-21 11:43:02,973 - 93650 - browsergym.core.env - DEBUG - Action executed
49
+ 2025-01-21 11:43:03,488 - 93650 - browsergym.core.env - DEBUG - Active page checked
50
+ 2025-01-21 11:43:03,489 - 93650 - browsergym.core.env - DEBUG - User message done
51
+ 2025-01-21 11:43:03,489 - 93650 - browsergym.core.env - DEBUG - Initiating task validation
52
+ 2025-01-21 11:43:03,492 - 93650 - browsergym.core.env - DEBUG - Task validation done
53
+ 2025-01-21 11:43:03,493 - 93650 - browsergym.core.observation - DEBUG - Marking frame ''
54
+ 2025-01-21 11:43:03,619 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
55
+ 2025-01-21 11:43:03,619 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
56
+ 2025-01-21 11:43:03,619 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 5934
57
+ 2025-01-21 11:43:03,621 - 93650 - browsergym.core.env - DEBUG - Observation extracted
58
+ 2025-01-21 11:43:03,627 - 93650 - browsergym.experiments.loop - DEBUG - Environment stepped.
59
+ 2025-01-21 11:43:03,628 - 93650 - browsergym.experiments.loop - DEBUG - Starting step 1.
60
+ 2025-01-21 11:43:03,637 - 93650 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
61
+ 2025-01-21 11:43:03,637 - 93650 - httpcore.http11 - DEBUG - send_request_headers.complete
62
+ 2025-01-21 11:43:03,637 - 93650 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
63
+ 2025-01-21 11:43:03,637 - 93650 - httpcore.http11 - DEBUG - send_request_body.complete
64
+ 2025-01-21 11:43:03,637 - 93650 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
65
+ 2025-01-21 11:43:04,458 - 93650 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Length', b'1495'), (b'Content-Type', b'application/json'), (b'apim-request-id', b'baed9237-d544-470c-a993-7249d61b5e03'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'x-accel-buffering', b'no'), (b'x-ms-rai-invoked', b'true'), (b'x-request-id', b'aa80493f-1934-4033-b1f4-6191402226ac'), (b'x-ms-region', b'East US'), (b'x-ratelimit-remaining-requests', b'118364'), (b'x-ratelimit-remaining-tokens', b'11729974'), (b'azureml-model-session', b'd032-20241220150939'), (b'x-envoy-upstream-service-time', b'716'), (b'x-ms-client-request-id', b'baed9237-d544-470c-a993-7249d61b5e03'), (b'Date', b'Tue, 21 Jan 2025 16:43:04 GMT')])
66
+ 2025-01-21 11:43:04,458 - 93650 - httpx - INFO - HTTP Request: POST https://ui-assist.openai.azure.com/openai/deployments/gpt-4o-mini-2024-07-18/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
67
+ 2025-01-21 11:43:04,458 - 93650 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
68
+ 2025-01-21 11:43:04,458 - 93650 - httpcore.http11 - DEBUG - receive_response_body.complete
69
+ 2025-01-21 11:43:04,458 - 93650 - httpcore.http11 - DEBUG - response_closed.started
70
+ 2025-01-21 11:43:04,458 - 93650 - httpcore.http11 - DEBUG - response_closed.complete
71
+ 2025-01-21 11:43:04,462 - 93650 - browsergym.experiments.loop - DEBUG - Agent chose action:
72
+ click('30')
73
+ 2025-01-21 11:43:04,469 - 93650 - browsergym.experiments.loop - DEBUG - Step info saved.
74
+ 2025-01-21 11:43:04,469 - 93650 - browsergym.experiments.loop - INFO - The checkbox for "SQkB" is already checked, and I need to check the checkbox for "ug" before submitting the form. The checkbox for "ug" is currently unchecked. I will click on the checkbox for "ug" to check it.
75
+
76
+ action:
77
+ click('30')
78
+
79
+ 2025-01-21 11:43:04,472 - 93650 - browsergym.experiments.loop - DEBUG - Chat info sent.
80
+ 2025-01-21 11:43:04,473 - 93650 - browsergym.experiments.loop - DEBUG - Sending action to environment.
81
+ 2025-01-21 11:43:04,473 - 93650 - browsergym.core.env - DEBUG - Executing action
82
+ 2025-01-21 11:43:04,545 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
83
+ 2025-01-21 11:43:04,546 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
84
+ 2025-01-21 11:43:04,548 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
85
+ 2025-01-21 11:43:04,549 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
86
+ 2025-01-21 11:43:04,550 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
87
+ 2025-01-21 11:43:04,551 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
88
+ 2025-01-21 11:43:04,647 - 93650 - browsergym.core.env - DEBUG - Action executed
89
+ 2025-01-21 11:43:05,163 - 93650 - browsergym.core.env - DEBUG - Active page checked
90
+ 2025-01-21 11:43:05,164 - 93650 - browsergym.core.env - DEBUG - User message done
91
+ 2025-01-21 11:43:05,164 - 93650 - browsergym.core.env - DEBUG - Initiating task validation
92
+ 2025-01-21 11:43:05,167 - 93650 - browsergym.core.env - DEBUG - Task validation done
93
+ 2025-01-21 11:43:05,167 - 93650 - browsergym.core.observation - DEBUG - Marking frame ''
94
+ 2025-01-21 11:43:05,304 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
95
+ 2025-01-21 11:43:05,304 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
96
+ 2025-01-21 11:43:05,304 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 6081
97
+ 2025-01-21 11:43:05,305 - 93650 - browsergym.core.env - DEBUG - Observation extracted
98
+ 2025-01-21 11:43:05,312 - 93650 - browsergym.experiments.loop - DEBUG - Environment stepped.
99
+ 2025-01-21 11:43:05,312 - 93650 - browsergym.experiments.loop - DEBUG - Starting step 2.
100
+ 2025-01-21 11:43:05,321 - 93650 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
101
+ 2025-01-21 11:43:05,322 - 93650 - httpcore.http11 - DEBUG - send_request_headers.complete
102
+ 2025-01-21 11:43:05,322 - 93650 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
103
+ 2025-01-21 11:43:05,322 - 93650 - httpcore.http11 - DEBUG - send_request_body.complete
104
+ 2025-01-21 11:43:05,322 - 93650 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
105
+ 2025-01-21 11:43:06,252 - 93650 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Length', b'1514'), (b'Content-Type', b'application/json'), (b'apim-request-id', b'62206cb6-4413-4bc0-a655-76cbf09a5b34'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'x-accel-buffering', b'no'), (b'x-ms-rai-invoked', b'true'), (b'x-request-id', b'91816859-c74c-4d3e-b81d-cf6af557aad2'), (b'x-ms-region', b'East US'), (b'x-ratelimit-remaining-requests', b'118363'), (b'x-ratelimit-remaining-tokens', b'11712162'), (b'azureml-model-session', b'd025-20241218102944'), (b'x-envoy-upstream-service-time', b'825'), (b'x-ms-client-request-id', b'62206cb6-4413-4bc0-a655-76cbf09a5b34'), (b'Date', b'Tue, 21 Jan 2025 16:43:06 GMT')])
106
+ 2025-01-21 11:43:06,252 - 93650 - httpx - INFO - HTTP Request: POST https://ui-assist.openai.azure.com/openai/deployments/gpt-4o-mini-2024-07-18/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
107
+ 2025-01-21 11:43:06,252 - 93650 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
108
+ 2025-01-21 11:43:06,253 - 93650 - httpcore.http11 - DEBUG - receive_response_body.complete
109
+ 2025-01-21 11:43:06,253 - 93650 - httpcore.http11 - DEBUG - response_closed.started
110
+ 2025-01-21 11:43:06,253 - 93650 - httpcore.http11 - DEBUG - response_closed.complete
111
+ 2025-01-21 11:43:06,257 - 93650 - browsergym.experiments.loop - DEBUG - Agent chose action:
112
+ click('33')
113
+ 2025-01-21 11:43:06,263 - 93650 - browsergym.experiments.loop - DEBUG - Step info saved.
114
+ 2025-01-21 11:43:06,263 - 93650 - browsergym.experiments.loop - INFO - The checkboxes for "SQkB" and "ug" are already checked, which is what I need for the goal. The next step is to click the "Submit" button to complete the task. The button is visible and clickable, so I will proceed with that action.
115
+
116
+ action:
117
+ click('33')
118
+
119
+ 2025-01-21 11:43:06,267 - 93650 - browsergym.experiments.loop - DEBUG - Chat info sent.
120
+ 2025-01-21 11:43:06,267 - 93650 - browsergym.experiments.loop - DEBUG - Sending action to environment.
121
+ 2025-01-21 11:43:06,267 - 93650 - browsergym.core.env - DEBUG - Executing action
122
+ 2025-01-21 11:43:06,327 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
123
+ 2025-01-21 11:43:06,328 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
124
+ 2025-01-21 11:43:06,329 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
125
+ 2025-01-21 11:43:06,330 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
126
+ 2025-01-21 11:43:06,331 - 93650 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
127
+ 2025-01-21 11:43:06,430 - 93650 - browsergym.core.env - DEBUG - Action executed
128
+ 2025-01-21 11:43:06,945 - 93650 - browsergym.core.env - DEBUG - Active page checked
129
+ 2025-01-21 11:43:06,946 - 93650 - browsergym.core.env - DEBUG - User message done
130
+ 2025-01-21 11:43:06,946 - 93650 - browsergym.core.env - DEBUG - Initiating task validation
131
+ 2025-01-21 11:43:06,949 - 93650 - browsergym.core.env - DEBUG - Task validation done
132
+ 2025-01-21 11:43:06,949 - 93650 - browsergym.core.observation - DEBUG - Marking frame ''
133
+ 2025-01-21 11:43:07,088 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
134
+ 2025-01-21 11:43:07,088 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
135
+ 2025-01-21 11:43:07,088 - 93650 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 6056
136
+ 2025-01-21 11:43:07,090 - 93650 - browsergym.core.env - DEBUG - Observation extracted
137
+ 2025-01-21 11:43:07,097 - 93650 - browsergym.experiments.loop - DEBUG - Environment stepped.
138
+ 2025-01-21 11:43:07,103 - 93650 - browsergym.experiments.loop - INFO - Saving summary info.
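
The log above follows a regular pattern: a `Starting step N.` line, then an `Agent chose action:` line whose next line holds the action itself. A minimal sketch for pulling the step/action pairs out of such a file is given here, assuming the log text has the diff `+` markers and gutter numbers already removed. `extract_actions` is a hypothetical helper, not part of browsergym or agentlab.

```python
import re
from typing import List, Optional, Tuple

ACTION_MARKER = "Agent chose action:"
STEP_RE = re.compile(r"Starting step (\d+)\.")


def extract_actions(log_text: str) -> List[Tuple[Optional[int], str]]:
    """Return (step, action) pairs found in an experiment.log-style text.

    The step is None if an action appears before any 'Starting step' line.
    """
    actions: List[Tuple[Optional[int], str]] = []
    current_step: Optional[int] = None
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = STEP_RE.search(line)
        if match:
            current_step = int(match.group(1))
        elif line.rstrip().endswith(ACTION_MARKER) and i + 1 < len(lines):
            # The action is logged on the line right after the marker.
            actions.append((current_step, lines[i + 1].strip()))
    return actions
```

Applied to the log in this hunk, the three pairs of interest would be steps 0, 1 and 2 with `click('18')`, `click('30')` and `click('33')`. The repeated `action:` blocks inside the INFO messages are ignored because those lines do not end with the marker string.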
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/goal_object.pkl.gz ADDED
Binary file (101 Bytes).
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/package_versions.txt ADDED
@@ -0,0 +1,231 @@
1
+ Faker==33.3.1
2
+ Farama-Notifications==0.0.4
3
+ Flask==3.1.0
4
+ GitPython==3.1.44
5
+ Jinja2==3.1.5
6
+ MarkupSafe==2.1.5
7
+ PyYAML==6.0.2
8
+ Pygments==2.19.1
9
+ SQLAlchemy==2.0.37
10
+ Werkzeug==3.1.3
11
+ agentlab==0.3.2
12
+ agentlab==0.3.2
13
+ aiofiles==23.2.1
14
+ aiohappyeyeballs==2.4.4
15
+ aiohttp-cors==0.7.0
16
+ aiohttp==3.11.11
17
+ aiolimiter==1.2.1
18
+ aiosignal==1.3.2
19
+ annotated-types==0.7.0
20
+ anyio==4.8.0
21
+ asttokens==3.0.0
22
+ attrs==24.3.0
23
+ beartype==0.12.0
24
+ beautifulsoup4==4.12.3
25
+ black==24.10.0
26
+ blacken-docs==1.19.1
27
+ blinker==1.9.0
28
+ browsergym-assistantbench==0.13.3
29
+ browsergym-core==0.13.3
30
+ browsergym-experiments==0.13.3
31
+ browsergym-miniwob==0.13.3
32
+ browsergym-visualwebarena==0.13.3
33
+ browsergym-webarena==0.13.3
34
+ browsergym-workarena==0.4.1
35
+ browsergym==0.13.3
36
+ cachetools==5.5.0
37
+ certifi==2024.12.14
38
+ cfgv==3.4.0
39
+ charset-normalizer==3.4.1
40
+ click==8.1.8
41
+ cloudpickle==3.1.1
42
+ colorama==0.4.6
43
+ colorama==0.4.6
44
+ colorful==0.5.6
45
+ contexttimer==0.3.3
46
+ contourpy==1.3.1
47
+ cycler==0.12.1
48
+ dask==2024.12.1
49
+ dataclasses-json==0.6.7
50
+ datasets==3.2.0
51
+ decorator==5.1.1
52
+ dill==0.3.8
53
+ distlib==0.3.9
54
+ distributed==2024.12.1
55
+ distro==1.9.0
56
+ english-words==2.0.1
57
+ evaluate==0.4.3
58
+ execnet==2.1.1
59
+ executing==2.1.0
60
+ fastapi==0.115.6
61
+ ffmpy==0.5.0
62
+ filelock==3.16.1
63
+ flaky==3.8.1
64
+ fonttools==4.55.3
65
+ frozenlist==1.5.0
66
+ fsspec==2024.9.0
67
+ gitdb==4.0.12
68
+ google-api-core==2.24.0
69
+ google-auth==2.37.0
70
+ googleapis-common-protos==1.66.0
71
+ gradio==5.12.0
72
+ gradio_client==1.5.4
73
+ greenlet==3.1.1
74
+ grpcio==1.69.0
75
+ gymnasium==1.0.0
76
+ h11==0.14.0
77
+ httpcore==1.0.7
78
+ httpx-sse==0.4.0
79
+ httpx==0.28.1
80
+ huggingface-hub==0.27.1
81
+ identify==2.6.5
82
+ idna==3.10
83
+ imageio==2.36.1
84
+ importlib_metadata==8.5.0
85
+ iniconfig==2.0.0
86
+ ipython==8.31.0
87
+ itsdangerous==2.2.0
+ jedi==0.19.2
+ jiter==0.8.2
+ joblib==1.4.2
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jsonschema-specifications==2024.10.1
+ jsonschema==4.23.0
+ kiwisolver==1.4.8
+ langchain-community==0.3.14
+ langchain-core==0.3.29
+ langchain-text-splitters==0.3.5
+ langchain==0.3.14
+ langsmith==0.2.10
+ lazy_loader==0.4
+ libvisualwebarena==0.0.15
+ libwebarena==0.0.4
+ linkify-it-py==2.0.3
+ locket==1.0.0
+ lxml==5.3.0
+ markdown-it-py==3.0.0
+ marshmallow==3.25.1
+ matplotlib-inline==0.1.7
+ matplotlib==3.10.0
+ mdit-py-plugins==0.4.2
+ mdurl==0.1.2
+ memray==1.15.0
+ mpmath==1.3.0
+ msgpack==1.1.0
+ multidict==6.1.0
+ multiprocess==0.70.16
+ mypy-extensions==1.0.0
+ networkx==3.4.2
+ nltk==3.9.1
+ nodeenv==1.9.1
+ numpy==1.26.4
+ openai==1.59.7
+ opencensus-context==0.1.3
+ opencensus==0.11.4
+ orjson==3.10.14
+ packaging==24.2
+ pandas==2.2.3
+ parso==0.8.4
+ partd==1.4.2
+ pathspec==0.12.1
+ pexpect==4.9.0
+ pillow==11.1.0
+ pip==24.3.1
+ platformdirs==4.3.6
+ playwright==1.49.1
+ pluggy==1.5.0
+ portalocker==3.1.1
+ pre_commit==4.0.1
+ prometheus_client==0.21.1
+ prompt_toolkit==3.0.48
+ propcache==0.2.1
+ proto-plus==1.25.0
+ protobuf==5.29.3
+ psutil==6.1.0
+ psutil==6.1.1
+ ptyprocess==0.7.0
+ pure_eval==0.2.3
+ py-spy==0.4.0
+ pyarrow==19.0.0
+ pyasn1==0.6.1
+ pyasn1_modules==0.4.1
+ pydantic-settings==2.7.1
+ pydantic==2.10.5
+ pydantic_core==2.27.2
+ pydub==0.25.1
+ pyee==12.0.0
+ pyparsing==3.2.1
+ pytest-base-url==2.1.0
+ pytest-playwright==0.6.2
+ pytest-xdist==3.6.1
+ pytest==8.3.4
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.1
+ python-multipart==0.0.20
+ python-slugify==8.0.4
+ pytz==2024.2
+ ray==2.40.0
+ referencing==0.35.1
+ regex==2024.11.6
+ requests-toolbelt==1.0.0
+ requests==2.32.3
+ rich==13.9.4
+ rpds-py==0.22.3
+ rsa==4.9
+ ruff==0.9.2
+ sacrebleu==2.5.1
+ safehttpx==0.1.6
+ safetensors==0.5.2
+ scikit-image==0.25.0
+ scipy==1.15.1
+ semantic-version==2.10.0
+ setproctitle==1.2.2
+ setuptools==75.8.0
+ shellingham==1.5.4
+ six==1.17.0
+ smart-open==7.1.0
+ smmap==5.0.2
+ sniffio==1.3.1
+ sortedcontainers==2.4.0
+ soupsieve==2.6
+ stack-data==0.6.3
+ starlette==0.41.3
+ sympy==1.13.3
+ tabulate==0.9.0
+ tblib==3.0.0
+ tenacity==9.0.0
+ text-generation==0.7.0
+ text-unidecode==1.3
+ textual==1.0.0
+ tifffile==2025.1.10
+ tiktoken==0.8.0
+ tokenize_rt==6.1.0
+ tokenizers==0.21.0
+ tomlkit==0.13.2
+ toolz==1.0.0
+ torch==2.2.2
+ tornado==6.4.2
+ tqdm==4.67.1
+ traitlets==5.14.3
+ transformers==4.48.0
+ typer==0.15.1
+ types-requests==2.32.0.20241016
+ types-tqdm==4.67.0.20241221
+ typing-inspect==0.9.0
+ typing_extensions==4.12.2
+ tzdata==2024.2
+ uc-micro-py==1.0.3
+ urllib3==2.3.0
+ uvicorn==0.34.0
+ virtualenv==20.29.0
+ wcwidth==0.2.13
+ weblinx-browsergym==0.0.1.dev14
+ weblinx==0.3.2
+ websockets==14.1
+ wheel==0.45.1
+ wrapt==1.17.2
+ xxhash==3.5.0
+ yarl==1.18.3
+ zict==3.0.0
+ zipp==3.21.0
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_0.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_1.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_2.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/screenshot_step_3.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_0.pkl.gz ADDED
Binary file (8.09 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_1.pkl.gz ADDED
Binary file (8.21 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_2.pkl.gz ADDED
Binary file (8.24 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/step_3.pkl.gz ADDED
Binary file (5.93 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20/summary_info.json ADDED
@@ -0,0 +1,44 @@
+ {
+ "n_steps": 3,
+ "cum_reward": 1.0,
+ "cum_raw_reward": 0,
+ "err_msg": null,
+ "stack_trace": null,
+ "stats.cum_steps": 4,
+ "stats.cum_n_token_goal": 27,
+ "stats.max_n_token_goal": 9,
+ "stats.cum_n_token_url": 93,
+ "stats.max_n_token_url": 31,
+ "stats.cum_n_token_focused_element_bid": 3,
+ "stats.max_n_token_focused_element_bid": 1,
+ "stats.cum_n_token_last_action": 8,
+ "stats.max_n_token_last_action": 4,
+ "stats.cum_n_token_last_action_error": 0,
+ "stats.max_n_token_last_action_error": 0,
+ "stats.cum_n_token_dom_txt": 2892,
+ "stats.max_n_token_dom_txt": 966,
+ "stats.cum_n_token_axtree_txt": 769,
+ "stats.max_n_token_axtree_txt": 257,
+ "stats.cum_n_token_pruned_html": 1014,
+ "stats.max_n_token_pruned_html": 340,
+ "stats.cum_n_retry_llm": 3,
+ "stats.max_n_retry_llm": 1,
+ "stats.cum_n_retry": 0.0,
+ "stats.max_n_retry": 0.0,
+ "stats.cum_busted_retry": 0,
+ "stats.max_busted_retry": 0,
+ "stats.cum_input_tokens": 4489,
+ "stats.max_input_tokens": 1514,
+ "stats.cum_output_tokens": 224,
+ "stats.max_output_tokens": 83,
+ "stats.cum_cost": 0.0008077499999999999,
+ "stats.max_cost": 0.0002715,
+ "stats.cum_n_token_agent_messages": 4670,
+ "stats.max_n_token_agent_messages": 1573,
+ "stats.cum_step_elapsed": 5.8213889598846436,
+ "stats.max_step_elapsed": 4.13917088508606,
+ "stats.cum_agent_elapsed": 5.303440809249878,
+ "stats.max_agent_elapsed": 3.529852867126465,
+ "terminated": true,
+ "truncated": false
+ }
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/exp_args.pkl ADDED
Binary file (2.31 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/experiment.log ADDED
@@ -0,0 +1,98 @@
+ 2025-01-21 11:42:54,795 - 93649 - browsergym.experiments.loop - INFO - Running experiment GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7 in:
+ /Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7
+ 2025-01-21 11:42:54,875 - 93649 - browsergym.experiments.loop - DEBUG - Agent created.
+ 2025-01-21 11:42:54,883 - 93649 - browsergym.experiments.loop - DEBUG - Environment created.
+ 2025-01-21 11:42:54,883 - 93649 - asyncio - DEBUG - Using selector: KqueueSelector
+ 2025-01-21 11:42:58,044 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='about:blank'>
+ 2025-01-21 11:42:58,067 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:42:58,228 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:42:58,247 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:42:58,455 - 93649 - browsergym.core.observation - DEBUG - Marking frame ''
+ 2025-01-21 11:42:59,016 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
+ 2025-01-21 11:42:59,016 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
+ 2025-01-21 11:42:59,016 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 3988
+ 2025-01-21 11:42:59,035 - 93649 - browsergym.experiments.loop - DEBUG - Environment reset.
+ 2025-01-21 11:42:59,035 - 93649 - browsergym.experiments.loop - DEBUG - Starting step 0.
+ 2025-01-21 11:42:59,836 - 93649 - httpcore.connection - DEBUG - connect_tcp.started host='ui-assist.openai.azure.com' port=443 local_address=None timeout=5.0 socket_options=None
+ 2025-01-21 11:43:00,130 - 93649 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x19cd2c850>
+ 2025-01-21 11:43:00,130 - 93649 - httpcore.connection - DEBUG - start_tls.started ssl_context=<ssl.SSLContext object at 0x19c89c7a0> server_hostname='ui-assist.openai.azure.com' timeout=5.0
+ 2025-01-21 11:43:00,658 - 93649 - httpcore.connection - DEBUG - start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x19cd2f710>
+ 2025-01-21 11:43:00,658 - 93649 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
+ 2025-01-21 11:43:00,659 - 93649 - httpcore.http11 - DEBUG - send_request_headers.complete
+ 2025-01-21 11:43:00,659 - 93649 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
+ 2025-01-21 11:43:00,659 - 93649 - httpcore.http11 - DEBUG - send_request_body.complete
+ 2025-01-21 11:43:00,659 - 93649 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
+ 2025-01-21 11:43:01,371 - 93649 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Length', b'1341'), (b'Content-Type', b'application/json'), (b'apim-request-id', b'6e259ead-d704-441d-8bf5-abaf9b3ad975'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'x-accel-buffering', b'no'), (b'x-ms-rai-invoked', b'true'), (b'x-request-id', b'3c0ef0c5-6827-4a02-946e-cc21c5c4bed9'), (b'x-ms-region', b'East US'), (b'x-ratelimit-remaining-requests', b'118369'), (b'x-ratelimit-remaining-tokens', b'11819230'), (b'azureml-model-session', b'd033-20241226130756'), (b'x-envoy-upstream-service-time', b'610'), (b'x-ms-client-request-id', b'6e259ead-d704-441d-8bf5-abaf9b3ad975'), (b'Date', b'Tue, 21 Jan 2025 16:43:01 GMT')])
+ 2025-01-21 11:43:01,371 - 93649 - httpx - INFO - HTTP Request: POST https://ui-assist.openai.azure.com/openai/deployments/gpt-4o-mini-2024-07-18/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
+ 2025-01-21 11:43:01,372 - 93649 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
+ 2025-01-21 11:43:01,372 - 93649 - httpcore.http11 - DEBUG - receive_response_body.complete
+ 2025-01-21 11:43:01,372 - 93649 - httpcore.http11 - DEBUG - response_closed.started
+ 2025-01-21 11:43:01,372 - 93649 - httpcore.http11 - DEBUG - response_closed.complete
+ 2025-01-21 11:43:01,551 - 93649 - browsergym.experiments.loop - DEBUG - Agent chose action:
+ click('27')
+ 2025-01-21 11:43:01,558 - 93649 - browsergym.experiments.loop - DEBUG - Step info saved.
+ 2025-01-21 11:43:01,558 - 93649 - browsergym.experiments.loop - INFO - I need to select the checkbox labeled "Neb" before clicking the "Submit" button. The checkbox for "Neb" has the bid '27'. I will perform a click action on this checkbox to select it.
+
+ action:
+ click('27')
+
+ 2025-01-21 11:43:01,562 - 93649 - browsergym.experiments.loop - DEBUG - Chat info sent.
+ 2025-01-21 11:43:01,562 - 93649 - browsergym.experiments.loop - DEBUG - Sending action to environment.
+ 2025-01-21 11:43:01,562 - 93649 - browsergym.core.env - DEBUG - Executing action
+ 2025-01-21 11:43:01,646 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:01,648 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:01,649 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:01,651 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:01,678 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:01,680 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:01,779 - 93649 - browsergym.core.env - DEBUG - Action executed
+ 2025-01-21 11:43:02,292 - 93649 - browsergym.core.env - DEBUG - Active page checked
+ 2025-01-21 11:43:02,292 - 93649 - browsergym.core.env - DEBUG - User message done
+ 2025-01-21 11:43:02,292 - 93649 - browsergym.core.env - DEBUG - Initiating task validation
+ 2025-01-21 11:43:02,297 - 93649 - browsergym.core.env - DEBUG - Task validation done
+ 2025-01-21 11:43:02,297 - 93649 - browsergym.core.observation - DEBUG - Marking frame ''
+ 2025-01-21 11:43:02,439 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
+ 2025-01-21 11:43:02,439 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
+ 2025-01-21 11:43:02,439 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 4346
+ 2025-01-21 11:43:02,442 - 93649 - browsergym.core.env - DEBUG - Observation extracted
+ 2025-01-21 11:43:02,451 - 93649 - browsergym.experiments.loop - DEBUG - Environment stepped.
+ 2025-01-21 11:43:02,451 - 93649 - browsergym.experiments.loop - DEBUG - Starting step 1.
+ 2025-01-21 11:43:02,464 - 93649 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
+ 2025-01-21 11:43:02,464 - 93649 - httpcore.http11 - DEBUG - send_request_headers.complete
+ 2025-01-21 11:43:02,465 - 93649 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
+ 2025-01-21 11:43:02,465 - 93649 - httpcore.http11 - DEBUG - send_request_body.complete
+ 2025-01-21 11:43:02,465 - 93649 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
+ 2025-01-21 11:43:03,261 - 93649 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Length', b'1488'), (b'Content-Type', b'application/json'), (b'apim-request-id', b'11dba28a-3e3c-447f-96a7-68381a05619a'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'x-accel-buffering', b'no'), (b'x-ms-rai-invoked', b'true'), (b'x-request-id', b'd75df10d-5bf3-4773-b1e5-82d36bdf2066'), (b'x-ms-region', b'East US'), (b'x-ratelimit-remaining-requests', b'118365'), (b'x-ratelimit-remaining-tokens', b'11747773'), (b'azureml-model-session', b'd104-20250114135242'), (b'x-envoy-upstream-service-time', b'691'), (b'x-ms-client-request-id', b'11dba28a-3e3c-447f-96a7-68381a05619a'), (b'Date', b'Tue, 21 Jan 2025 16:43:02 GMT')])
+ 2025-01-21 11:43:03,261 - 93649 - httpx - INFO - HTTP Request: POST https://ui-assist.openai.azure.com/openai/deployments/gpt-4o-mini-2024-07-18/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
+ 2025-01-21 11:43:03,261 - 93649 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
+ 2025-01-21 11:43:03,262 - 93649 - httpcore.http11 - DEBUG - receive_response_body.complete
+ 2025-01-21 11:43:03,262 - 93649 - httpcore.http11 - DEBUG - response_closed.started
+ 2025-01-21 11:43:03,262 - 93649 - httpcore.http11 - DEBUG - response_closed.complete
+ 2025-01-21 11:43:03,267 - 93649 - browsergym.experiments.loop - DEBUG - Agent chose action:
+ click('33')
+ 2025-01-21 11:43:03,273 - 93649 - browsergym.experiments.loop - DEBUG - Step info saved.
+ 2025-01-21 11:43:03,273 - 93649 - browsergym.experiments.loop - INFO - The checkbox for "Neb" is already checked, which is the desired state. The next step is to submit the form by clicking the "Submit" button. The button is visible and clickable, so I will proceed to click it.
+
+ action:
+ click('33')
+
+ 2025-01-21 11:43:03,277 - 93649 - browsergym.experiments.loop - DEBUG - Chat info sent.
+ 2025-01-21 11:43:03,277 - 93649 - browsergym.experiments.loop - DEBUG - Sending action to environment.
+ 2025-01-21 11:43:03,277 - 93649 - browsergym.core.env - DEBUG - Executing action
+ 2025-01-21 11:43:03,344 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:03,345 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:03,347 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:03,348 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:03,349 - 93649 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-checkboxes.html'>
+ 2025-01-21 11:43:03,447 - 93649 - browsergym.core.env - DEBUG - Action executed
+ 2025-01-21 11:43:03,963 - 93649 - browsergym.core.env - DEBUG - Active page checked
+ 2025-01-21 11:43:03,963 - 93649 - browsergym.core.env - DEBUG - User message done
+ 2025-01-21 11:43:03,963 - 93649 - browsergym.core.env - DEBUG - Initiating task validation
+ 2025-01-21 11:43:03,967 - 93649 - browsergym.core.env - DEBUG - Task validation done
+ 2025-01-21 11:43:03,967 - 93649 - browsergym.core.observation - DEBUG - Marking frame ''
+ 2025-01-21 11:43:04,103 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
+ 2025-01-21 11:43:04,103 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
+ 2025-01-21 11:43:04,103 - 93649 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 4325
+ 2025-01-21 11:43:04,104 - 93649 - browsergym.core.env - DEBUG - Observation extracted
+ 2025-01-21 11:43:04,112 - 93649 - browsergym.experiments.loop - DEBUG - Environment stepped.
+ 2025-01-21 11:43:04,117 - 93649 - browsergym.experiments.loop - INFO - Saving summary info.
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/goal_object.pkl.gz ADDED
Binary file (96 Bytes).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/package_versions.txt ADDED
@@ -0,0 +1,231 @@
+ Faker==33.3.1
+ Farama-Notifications==0.0.4
+ Flask==3.1.0
+ GitPython==3.1.44
+ Jinja2==3.1.5
+ MarkupSafe==2.1.5
+ PyYAML==6.0.2
+ Pygments==2.19.1
+ SQLAlchemy==2.0.37
+ Werkzeug==3.1.3
+ agentlab==0.3.2
+ agentlab==0.3.2
+ aiofiles==23.2.1
+ aiohappyeyeballs==2.4.4
+ aiohttp-cors==0.7.0
+ aiohttp==3.11.11
+ aiolimiter==1.2.1
+ aiosignal==1.3.2
+ annotated-types==0.7.0
+ anyio==4.8.0
+ asttokens==3.0.0
+ attrs==24.3.0
+ beartype==0.12.0
+ beautifulsoup4==4.12.3
+ black==24.10.0
+ blacken-docs==1.19.1
+ blinker==1.9.0
+ browsergym-assistantbench==0.13.3
+ browsergym-core==0.13.3
+ browsergym-experiments==0.13.3
+ browsergym-miniwob==0.13.3
+ browsergym-visualwebarena==0.13.3
+ browsergym-webarena==0.13.3
+ browsergym-workarena==0.4.1
+ browsergym==0.13.3
+ cachetools==5.5.0
+ certifi==2024.12.14
+ cfgv==3.4.0
+ charset-normalizer==3.4.1
+ click==8.1.8
+ cloudpickle==3.1.1
+ colorama==0.4.6
+ colorama==0.4.6
+ colorful==0.5.6
+ contexttimer==0.3.3
+ contourpy==1.3.1
+ cycler==0.12.1
+ dask==2024.12.1
+ dataclasses-json==0.6.7
+ datasets==3.2.0
+ decorator==5.1.1
+ dill==0.3.8
+ distlib==0.3.9
+ distributed==2024.12.1
+ distro==1.9.0
+ english-words==2.0.1
+ evaluate==0.4.3
+ execnet==2.1.1
+ executing==2.1.0
+ fastapi==0.115.6
+ ffmpy==0.5.0
+ filelock==3.16.1
+ flaky==3.8.1
+ fonttools==4.55.3
+ frozenlist==1.5.0
+ fsspec==2024.9.0
+ gitdb==4.0.12
+ google-api-core==2.24.0
+ google-auth==2.37.0
+ googleapis-common-protos==1.66.0
+ gradio==5.12.0
+ gradio_client==1.5.4
+ greenlet==3.1.1
+ grpcio==1.69.0
+ gymnasium==1.0.0
+ h11==0.14.0
+ httpcore==1.0.7
+ httpx-sse==0.4.0
+ httpx==0.28.1
+ huggingface-hub==0.27.1
+ identify==2.6.5
+ idna==3.10
+ imageio==2.36.1
+ importlib_metadata==8.5.0
+ iniconfig==2.0.0
+ ipython==8.31.0
+ itsdangerous==2.2.0
+ jedi==0.19.2
+ jiter==0.8.2
+ joblib==1.4.2
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jsonschema-specifications==2024.10.1
+ jsonschema==4.23.0
+ kiwisolver==1.4.8
+ langchain-community==0.3.14
+ langchain-core==0.3.29
+ langchain-text-splitters==0.3.5
+ langchain==0.3.14
+ langsmith==0.2.10
+ lazy_loader==0.4
+ libvisualwebarena==0.0.15
+ libwebarena==0.0.4
+ linkify-it-py==2.0.3
+ locket==1.0.0
+ lxml==5.3.0
+ markdown-it-py==3.0.0
+ marshmallow==3.25.1
+ matplotlib-inline==0.1.7
+ matplotlib==3.10.0
+ mdit-py-plugins==0.4.2
+ mdurl==0.1.2
+ memray==1.15.0
+ mpmath==1.3.0
+ msgpack==1.1.0
+ multidict==6.1.0
+ multiprocess==0.70.16
+ mypy-extensions==1.0.0
+ networkx==3.4.2
+ nltk==3.9.1
+ nodeenv==1.9.1
+ numpy==1.26.4
+ openai==1.59.7
+ opencensus-context==0.1.3
+ opencensus==0.11.4
+ orjson==3.10.14
+ packaging==24.2
+ pandas==2.2.3
+ parso==0.8.4
+ partd==1.4.2
+ pathspec==0.12.1
+ pexpect==4.9.0
+ pillow==11.1.0
+ pip==24.3.1
+ platformdirs==4.3.6
+ playwright==1.49.1
+ pluggy==1.5.0
+ portalocker==3.1.1
+ pre_commit==4.0.1
+ prometheus_client==0.21.1
+ prompt_toolkit==3.0.48
+ propcache==0.2.1
+ proto-plus==1.25.0
+ protobuf==5.29.3
+ psutil==6.1.0
+ psutil==6.1.1
+ ptyprocess==0.7.0
+ pure_eval==0.2.3
+ py-spy==0.4.0
+ pyarrow==19.0.0
+ pyasn1==0.6.1
+ pyasn1_modules==0.4.1
+ pydantic-settings==2.7.1
+ pydantic==2.10.5
+ pydantic_core==2.27.2
+ pydub==0.25.1
+ pyee==12.0.0
+ pyparsing==3.2.1
+ pytest-base-url==2.1.0
+ pytest-playwright==0.6.2
+ pytest-xdist==3.6.1
+ pytest==8.3.4
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.1
+ python-multipart==0.0.20
+ python-slugify==8.0.4
+ pytz==2024.2
+ ray==2.40.0
+ referencing==0.35.1
+ regex==2024.11.6
+ requests-toolbelt==1.0.0
+ requests==2.32.3
+ rich==13.9.4
+ rpds-py==0.22.3
+ rsa==4.9
+ ruff==0.9.2
+ sacrebleu==2.5.1
+ safehttpx==0.1.6
+ safetensors==0.5.2
+ scikit-image==0.25.0
+ scipy==1.15.1
+ semantic-version==2.10.0
+ setproctitle==1.2.2
+ setuptools==75.8.0
+ shellingham==1.5.4
+ six==1.17.0
+ smart-open==7.1.0
+ smmap==5.0.2
+ sniffio==1.3.1
+ sortedcontainers==2.4.0
+ soupsieve==2.6
+ stack-data==0.6.3
+ starlette==0.41.3
+ sympy==1.13.3
+ tabulate==0.9.0
+ tblib==3.0.0
+ tenacity==9.0.0
+ text-generation==0.7.0
+ text-unidecode==1.3
+ textual==1.0.0
+ tifffile==2025.1.10
+ tiktoken==0.8.0
+ tokenize_rt==6.1.0
+ tokenizers==0.21.0
+ tomlkit==0.13.2
+ toolz==1.0.0
+ torch==2.2.2
+ tornado==6.4.2
+ tqdm==4.67.1
+ traitlets==5.14.3
+ transformers==4.48.0
+ typer==0.15.1
+ types-requests==2.32.0.20241016
+ types-tqdm==4.67.0.20241221
+ typing-inspect==0.9.0
+ typing_extensions==4.12.2
+ tzdata==2024.2
+ uc-micro-py==1.0.3
+ urllib3==2.3.0
+ uvicorn==0.34.0
+ virtualenv==20.29.0
+ wcwidth==0.2.13
+ weblinx-browsergym==0.0.1.dev14
+ weblinx==0.3.2
+ websockets==14.1
+ wheel==0.45.1
+ wrapt==1.17.2
+ xxhash==3.5.0
+ yarl==1.18.3
+ zict==3.0.0
+ zipp==3.21.0
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/screenshot_step_0.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/screenshot_step_1.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/screenshot_step_2.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/step_0.pkl.gz ADDED
Binary file (8 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/step_1.pkl.gz ADDED
Binary file (8.15 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/step_2.pkl.gz ADDED
Binary file (5.86 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7/summary_info.json ADDED
@@ -0,0 +1,44 @@
+ {
+ "n_steps": 2,
+ "cum_reward": 1.0,
+ "cum_raw_reward": 0,
+ "err_msg": null,
+ "stack_trace": null,
+ "stats.cum_steps": 3,
+ "stats.cum_n_token_goal": 12,
+ "stats.max_n_token_goal": 6,
+ "stats.cum_n_token_url": 62,
+ "stats.max_n_token_url": 31,
+ "stats.cum_n_token_focused_element_bid": 2,
+ "stats.max_n_token_focused_element_bid": 1,
+ "stats.cum_n_token_last_action": 4,
+ "stats.max_n_token_last_action": 4,
+ "stats.cum_n_token_last_action_error": 0,
+ "stats.max_n_token_last_action_error": 0,
+ "stats.cum_n_token_dom_txt": 1902,
+ "stats.max_n_token_dom_txt": 952,
+ "stats.cum_n_token_axtree_txt": 468,
+ "stats.max_n_token_axtree_txt": 235,
+ "stats.cum_n_token_pruned_html": 650,
+ "stats.max_n_token_pruned_html": 326,
+ "stats.cum_n_retry_llm": 2,
+ "stats.max_n_retry_llm": 1,
+ "stats.cum_n_retry": 0.0,
+ "stats.max_n_retry": 0.0,
+ "stats.cum_busted_retry": 0,
+ "stats.max_busted_retry": 0,
+ "stats.cum_input_tokens": 2889,
+ "stats.max_input_tokens": 1454,
+ "stats.cum_output_tokens": 122,
+ "stats.max_output_tokens": 63,
+ "stats.cum_cost": 0.0005065499999999999,
+ "stats.max_cost": 0.0002559,
+ "stats.cum_n_token_agent_messages": 3002,
+ "stats.max_n_token_agent_messages": 1512,
+ "stats.cum_step_elapsed": 5.017662048339844,
+ "stats.max_step_elapsed": 4.137955188751221,
+ "stats.cum_agent_elapsed": 3.1547908782958984,
+ "stats.max_agent_elapsed": 2.342694044113159,
+ "terminated": true,
+ "truncated": false
+ }
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/exp_args.pkl ADDED
Binary file (2.3 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/experiment.log ADDED
@@ -0,0 +1,58 @@
+ 2025-01-21 11:42:54,795 - 93648 - browsergym.experiments.loop - INFO - Running experiment GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14 in:
+ /Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14
+ 2025-01-21 11:42:54,875 - 93648 - browsergym.experiments.loop - DEBUG - Agent created.
+ 2025-01-21 11:42:54,883 - 93648 - browsergym.experiments.loop - DEBUG - Environment created.
+ 2025-01-21 11:42:54,883 - 93648 - asyncio - DEBUG - Using selector: KqueueSelector
+ 2025-01-21 11:42:58,030 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='about:blank'>
+ 2025-01-21 11:42:58,060 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
+ 2025-01-21 11:42:58,228 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
+ 2025-01-21 11:42:58,247 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
+ 2025-01-21 11:42:58,382 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
+ 2025-01-21 11:42:58,386 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
+ 2025-01-21 11:42:58,527 - 93648 - browsergym.core.observation - DEBUG - Marking frame ''
+ 2025-01-21 11:42:59,050 - 93648 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
+ 2025-01-21 11:42:59,050 - 93648 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
+ 2025-01-21 11:42:59,050 - 93648 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 4375
+ 2025-01-21 11:42:59,067 - 93648 - browsergym.experiments.loop - DEBUG - Environment reset.
+ 2025-01-21 11:42:59,067 - 93648 - browsergym.experiments.loop - DEBUG - Starting step 0.
+ 2025-01-21 11:42:59,861 - 93648 - httpcore.connection - DEBUG - connect_tcp.started host='ui-assist.openai.azure.com' port=443 local_address=None timeout=5.0 socket_options=None
+ 2025-01-21 11:43:00,130 - 93648 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x1a34f6f50>
+ 2025-01-21 11:43:00,130 - 93648 - httpcore.connection - DEBUG - start_tls.started ssl_context=<ssl.SSLContext object at 0x1a30987a0> server_hostname='ui-assist.openai.azure.com' timeout=5.0
+ 2025-01-21 11:43:00,674 - 93648 - httpcore.connection - DEBUG - start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x1a3486190>
+ 2025-01-21 11:43:00,674 - 93648 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
+ 2025-01-21 11:43:00,675 - 93648 - httpcore.http11 - DEBUG - send_request_headers.complete
+ 2025-01-21 11:43:00,675 - 93648 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
+ 2025-01-21 11:43:00,675 - 93648 - httpcore.http11 - DEBUG - send_request_body.complete
+ 2025-01-21 11:43:00,675 - 93648 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
+ 2025-01-21 11:43:01,484 - 93648 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Length', b'1319'), (b'Content-Type', b'application/json'), (b'apim-request-id', b'6922dc2a-c026-40a9-8137-566bd2e45a34'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'x-accel-buffering', b'no'), (b'x-ms-rai-invoked', b'true'), (b'x-request-id', b'0d4fa485-1569-4b46-aebc-14dc43d4b684'), (b'x-ms-region', b'East US'), (b'x-ratelimit-remaining-requests', b'118369'), (b'x-ratelimit-remaining-tokens', b'11819053'), (b'azureml-model-session', b'd029-20250107021410'), (b'x-envoy-upstream-service-time', b'695'), (b'x-ms-client-request-id', b'6922dc2a-c026-40a9-8137-566bd2e45a34'), (b'Date', b'Tue, 21 Jan 2025 16:43:00 GMT')])
28
+ 2025-01-21 11:43:01,485 - 93648 - httpx - INFO - HTTP Request: POST https://ui-assist.openai.azure.com/openai/deployments/gpt-4o-mini-2024-07-18/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
29
+ 2025-01-21 11:43:01,485 - 93648 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
30
+ 2025-01-21 11:43:01,485 - 93648 - httpcore.http11 - DEBUG - receive_response_body.complete
31
+ 2025-01-21 11:43:01,485 - 93648 - httpcore.http11 - DEBUG - response_closed.started
32
+ 2025-01-21 11:43:01,485 - 93648 - httpcore.http11 - DEBUG - response_closed.complete
33
+ 2025-01-21 11:43:01,703 - 93648 - browsergym.experiments.loop - DEBUG - Agent chose action:
34
+ click('20')
35
+ 2025-01-21 11:43:01,710 - 93648 - browsergym.experiments.loop - DEBUG - Step info saved.
36
+ 2025-01-21 11:43:01,710 - 93648 - browsergym.experiments.loop - INFO - The dialog box is currently open, and the "Close" button is visible and focused. To close the dialog box, I will click the "Close" button, which has the bid '20'.
37
+
38
+ action:
39
+ click('20')
40
+
41
+ 2025-01-21 11:43:01,715 - 93648 - browsergym.experiments.loop - DEBUG - Chat info sent.
42
+ 2025-01-21 11:43:01,715 - 93648 - browsergym.experiments.loop - DEBUG - Sending action to environment.
43
+ 2025-01-21 11:43:01,715 - 93648 - browsergym.core.env - DEBUG - Executing action
44
+ 2025-01-21 11:43:01,797 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
45
+ 2025-01-21 11:43:01,798 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
46
+ 2025-01-21 11:43:01,800 - 93648 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
47
+ 2025-01-21 11:43:01,903 - 93648 - browsergym.core.env - DEBUG - Action executed
48
+ 2025-01-21 11:43:02,423 - 93648 - browsergym.core.env - DEBUG - Active page checked
49
+ 2025-01-21 11:43:02,424 - 93648 - browsergym.core.env - DEBUG - User message done
50
+ 2025-01-21 11:43:02,424 - 93648 - browsergym.core.env - DEBUG - Initiating task validation
51
+ 2025-01-21 11:43:02,428 - 93648 - browsergym.core.env - DEBUG - Task validation done
52
+ 2025-01-21 11:43:02,429 - 93648 - browsergym.core.observation - DEBUG - Marking frame ''
53
+ 2025-01-21 11:43:02,554 - 93648 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
54
+ 2025-01-21 11:43:02,554 - 93648 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
55
+ 2025-01-21 11:43:02,554 - 93648 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 1173
56
+ 2025-01-21 11:43:02,556 - 93648 - browsergym.core.env - DEBUG - Observation extracted
57
+ 2025-01-21 11:43:02,563 - 93648 - browsergym.experiments.loop - DEBUG - Environment stepped.
58
+ 2025-01-21 11:43:02,567 - 93648 - browsergym.experiments.loop - INFO - Saving summary info.
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/goal_object.pkl.gz ADDED
Binary file (105 Bytes). View file
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/package_versions.txt ADDED
@@ -0,0 +1,231 @@
1
+ Faker==33.3.1
2
+ Farama-Notifications==0.0.4
3
+ Flask==3.1.0
4
+ GitPython==3.1.44
5
+ Jinja2==3.1.5
6
+ MarkupSafe==2.1.5
7
+ PyYAML==6.0.2
8
+ Pygments==2.19.1
9
+ SQLAlchemy==2.0.37
10
+ Werkzeug==3.1.3
11
+ agentlab==0.3.2
12
+ agentlab==0.3.2
13
+ aiofiles==23.2.1
14
+ aiohappyeyeballs==2.4.4
15
+ aiohttp-cors==0.7.0
16
+ aiohttp==3.11.11
17
+ aiolimiter==1.2.1
18
+ aiosignal==1.3.2
19
+ annotated-types==0.7.0
20
+ anyio==4.8.0
21
+ asttokens==3.0.0
22
+ attrs==24.3.0
23
+ beartype==0.12.0
24
+ beautifulsoup4==4.12.3
25
+ black==24.10.0
26
+ blacken-docs==1.19.1
27
+ blinker==1.9.0
28
+ browsergym-assistantbench==0.13.3
29
+ browsergym-core==0.13.3
30
+ browsergym-experiments==0.13.3
31
+ browsergym-miniwob==0.13.3
32
+ browsergym-visualwebarena==0.13.3
33
+ browsergym-webarena==0.13.3
34
+ browsergym-workarena==0.4.1
35
+ browsergym==0.13.3
36
+ cachetools==5.5.0
37
+ certifi==2024.12.14
38
+ cfgv==3.4.0
39
+ charset-normalizer==3.4.1
40
+ click==8.1.8
41
+ cloudpickle==3.1.1
42
+ colorama==0.4.6
43
+ colorama==0.4.6
44
+ colorful==0.5.6
45
+ contexttimer==0.3.3
46
+ contourpy==1.3.1
47
+ cycler==0.12.1
48
+ dask==2024.12.1
49
+ dataclasses-json==0.6.7
50
+ datasets==3.2.0
51
+ decorator==5.1.1
52
+ dill==0.3.8
53
+ distlib==0.3.9
54
+ distributed==2024.12.1
55
+ distro==1.9.0
56
+ english-words==2.0.1
57
+ evaluate==0.4.3
58
+ execnet==2.1.1
59
+ executing==2.1.0
60
+ fastapi==0.115.6
61
+ ffmpy==0.5.0
62
+ filelock==3.16.1
63
+ flaky==3.8.1
64
+ fonttools==4.55.3
65
+ frozenlist==1.5.0
66
+ fsspec==2024.9.0
67
+ gitdb==4.0.12
68
+ google-api-core==2.24.0
69
+ google-auth==2.37.0
70
+ googleapis-common-protos==1.66.0
71
+ gradio==5.12.0
72
+ gradio_client==1.5.4
73
+ greenlet==3.1.1
74
+ grpcio==1.69.0
75
+ gymnasium==1.0.0
76
+ h11==0.14.0
77
+ httpcore==1.0.7
78
+ httpx-sse==0.4.0
79
+ httpx==0.28.1
80
+ huggingface-hub==0.27.1
81
+ identify==2.6.5
82
+ idna==3.10
83
+ imageio==2.36.1
84
+ importlib_metadata==8.5.0
85
+ iniconfig==2.0.0
86
+ ipython==8.31.0
87
+ itsdangerous==2.2.0
88
+ jedi==0.19.2
89
+ jiter==0.8.2
90
+ joblib==1.4.2
91
+ jsonpatch==1.33
92
+ jsonpointer==3.0.0
93
+ jsonschema-specifications==2024.10.1
94
+ jsonschema==4.23.0
95
+ kiwisolver==1.4.8
96
+ langchain-community==0.3.14
97
+ langchain-core==0.3.29
98
+ langchain-text-splitters==0.3.5
99
+ langchain==0.3.14
100
+ langsmith==0.2.10
101
+ lazy_loader==0.4
102
+ libvisualwebarena==0.0.15
103
+ libwebarena==0.0.4
104
+ linkify-it-py==2.0.3
105
+ locket==1.0.0
106
+ lxml==5.3.0
107
+ markdown-it-py==3.0.0
108
+ marshmallow==3.25.1
109
+ matplotlib-inline==0.1.7
110
+ matplotlib==3.10.0
111
+ mdit-py-plugins==0.4.2
112
+ mdurl==0.1.2
113
+ memray==1.15.0
114
+ mpmath==1.3.0
115
+ msgpack==1.1.0
116
+ multidict==6.1.0
117
+ multiprocess==0.70.16
118
+ mypy-extensions==1.0.0
119
+ networkx==3.4.2
120
+ nltk==3.9.1
121
+ nodeenv==1.9.1
122
+ numpy==1.26.4
123
+ openai==1.59.7
124
+ opencensus-context==0.1.3
125
+ opencensus==0.11.4
126
+ orjson==3.10.14
127
+ packaging==24.2
128
+ pandas==2.2.3
129
+ parso==0.8.4
130
+ partd==1.4.2
131
+ pathspec==0.12.1
132
+ pexpect==4.9.0
133
+ pillow==11.1.0
134
+ pip==24.3.1
135
+ platformdirs==4.3.6
136
+ playwright==1.49.1
137
+ pluggy==1.5.0
138
+ portalocker==3.1.1
139
+ pre_commit==4.0.1
140
+ prometheus_client==0.21.1
141
+ prompt_toolkit==3.0.48
142
+ propcache==0.2.1
143
+ proto-plus==1.25.0
144
+ protobuf==5.29.3
145
+ psutil==6.1.0
146
+ psutil==6.1.1
147
+ ptyprocess==0.7.0
148
+ pure_eval==0.2.3
149
+ py-spy==0.4.0
150
+ pyarrow==19.0.0
151
+ pyasn1==0.6.1
152
+ pyasn1_modules==0.4.1
153
+ pydantic-settings==2.7.1
154
+ pydantic==2.10.5
155
+ pydantic_core==2.27.2
156
+ pydub==0.25.1
157
+ pyee==12.0.0
158
+ pyparsing==3.2.1
159
+ pytest-base-url==2.1.0
160
+ pytest-playwright==0.6.2
161
+ pytest-xdist==3.6.1
162
+ pytest==8.3.4
163
+ python-dateutil==2.9.0.post0
164
+ python-dotenv==1.0.1
165
+ python-multipart==0.0.20
166
+ python-slugify==8.0.4
167
+ pytz==2024.2
168
+ ray==2.40.0
169
+ referencing==0.35.1
170
+ regex==2024.11.6
171
+ requests-toolbelt==1.0.0
172
+ requests==2.32.3
173
+ rich==13.9.4
174
+ rpds-py==0.22.3
175
+ rsa==4.9
176
+ ruff==0.9.2
177
+ sacrebleu==2.5.1
178
+ safehttpx==0.1.6
179
+ safetensors==0.5.2
180
+ scikit-image==0.25.0
181
+ scipy==1.15.1
182
+ semantic-version==2.10.0
183
+ setproctitle==1.2.2
184
+ setuptools==75.8.0
185
+ shellingham==1.5.4
186
+ six==1.17.0
187
+ smart-open==7.1.0
188
+ smmap==5.0.2
189
+ sniffio==1.3.1
190
+ sortedcontainers==2.4.0
191
+ soupsieve==2.6
192
+ stack-data==0.6.3
193
+ starlette==0.41.3
194
+ sympy==1.13.3
195
+ tabulate==0.9.0
196
+ tblib==3.0.0
197
+ tenacity==9.0.0
198
+ text-generation==0.7.0
199
+ text-unidecode==1.3
200
+ textual==1.0.0
201
+ tifffile==2025.1.10
202
+ tiktoken==0.8.0
203
+ tokenize_rt==6.1.0
204
+ tokenizers==0.21.0
205
+ tomlkit==0.13.2
206
+ toolz==1.0.0
207
+ torch==2.2.2
208
+ tornado==6.4.2
209
+ tqdm==4.67.1
210
+ traitlets==5.14.3
211
+ transformers==4.48.0
212
+ typer==0.15.1
213
+ types-requests==2.32.0.20241016
214
+ types-tqdm==4.67.0.20241221
215
+ typing-inspect==0.9.0
216
+ typing_extensions==4.12.2
217
+ tzdata==2024.2
218
+ uc-micro-py==1.0.3
219
+ urllib3==2.3.0
220
+ uvicorn==0.34.0
221
+ virtualenv==20.29.0
222
+ wcwidth==0.2.13
223
+ weblinx-browsergym==0.0.1.dev14
224
+ weblinx==0.3.2
225
+ websockets==14.1
226
+ wheel==0.45.1
227
+ wrapt==1.17.2
228
+ xxhash==3.5.0
229
+ yarl==1.18.3
230
+ zict==3.0.0
231
+ zipp==3.21.0
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/screenshot_step_0.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/screenshot_step_1.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/step_0.pkl.gz ADDED
Binary file (8.11 kB). View file
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/step_1.pkl.gz ADDED
Binary file (4.76 kB). View file
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14/summary_info.json ADDED
@@ -0,0 +1,44 @@
1
+ {
2
+ "n_steps": 1,
3
+ "cum_reward": 1.0,
4
+ "cum_raw_reward": 0,
5
+ "err_msg": null,
6
+ "stack_trace": null,
7
+ "stats.cum_steps": 2,
8
+ "stats.cum_n_token_goal": 10,
9
+ "stats.max_n_token_goal": 10,
10
+ "stats.cum_n_token_url": 30,
11
+ "stats.max_n_token_url": 30,
12
+ "stats.cum_n_token_focused_element_bid": 1,
13
+ "stats.max_n_token_focused_element_bid": 1,
14
+ "stats.cum_n_token_last_action": 0,
15
+ "stats.max_n_token_last_action": 0,
16
+ "stats.cum_n_token_last_action_error": 0,
17
+ "stats.max_n_token_last_action_error": 0,
18
+ "stats.cum_n_token_dom_txt": 1257,
19
+ "stats.max_n_token_dom_txt": 1257,
20
+ "stats.cum_n_token_axtree_txt": 108,
21
+ "stats.max_n_token_axtree_txt": 108,
22
+ "stats.cum_n_token_pruned_html": 658,
23
+ "stats.max_n_token_pruned_html": 658,
24
+ "stats.cum_n_retry_llm": 1,
25
+ "stats.max_n_retry_llm": 1,
26
+ "stats.cum_n_retry": 0.0,
27
+ "stats.max_n_retry": 0.0,
28
+ "stats.cum_busted_retry": 0,
29
+ "stats.max_busted_retry": 0,
30
+ "stats.cum_input_tokens": 1643,
31
+ "stats.max_input_tokens": 1643,
32
+ "stats.cum_output_tokens": 56,
33
+ "stats.max_output_tokens": 56,
34
+ "stats.cum_cost": 0.00028005,
35
+ "stats.max_cost": 0.00028005,
36
+ "stats.cum_n_token_agent_messages": 1697,
37
+ "stats.max_n_token_agent_messages": 1697,
38
+ "stats.cum_step_elapsed": 4.1717939376831055,
39
+ "stats.max_step_elapsed": 4.1717939376831055,
40
+ "stats.cum_agent_elapsed": 2.4256081581115723,
41
+ "stats.max_agent_elapsed": 2.4256081581115723,
42
+ "terminated": true,
43
+ "truncated": false
44
+ }
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/exp_args.pkl ADDED
Binary file (2.3 kB). View file
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/experiment.log ADDED
@@ -0,0 +1,58 @@
1
+ 2025-01-21 11:42:54,795 - 93647 - browsergym.experiments.loop - INFO - Running experiment GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28 in:
2
+ /Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28
3
+ 2025-01-21 11:42:54,876 - 93647 - browsergym.experiments.loop - DEBUG - Agent created.
4
+ 2025-01-21 11:42:54,883 - 93647 - browsergym.experiments.loop - DEBUG - Environment created.
5
+ 2025-01-21 11:42:54,883 - 93647 - asyncio - DEBUG - Using selector: KqueueSelector
6
+ 2025-01-21 11:42:58,091 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='about:blank'>
7
+ 2025-01-21 11:42:58,128 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
8
+ 2025-01-21 11:42:58,228 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
9
+ 2025-01-21 11:42:58,247 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
10
+ 2025-01-21 11:42:58,381 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
11
+ 2025-01-21 11:42:58,385 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
12
+ 2025-01-21 11:42:58,528 - 93647 - browsergym.core.observation - DEBUG - Marking frame ''
13
+ 2025-01-21 11:42:59,050 - 93647 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
14
+ 2025-01-21 11:42:59,050 - 93647 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
15
+ 2025-01-21 11:42:59,050 - 93647 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 3713
16
+ 2025-01-21 11:42:59,066 - 93647 - browsergym.experiments.loop - DEBUG - Environment reset.
17
+ 2025-01-21 11:42:59,066 - 93647 - browsergym.experiments.loop - DEBUG - Starting step 0.
18
+ 2025-01-21 11:42:59,855 - 93647 - httpcore.connection - DEBUG - connect_tcp.started host='ui-assist.openai.azure.com' port=443 local_address=None timeout=5.0 socket_options=None
19
+ 2025-01-21 11:43:00,130 - 93647 - httpcore.connection - DEBUG - connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x1a91bbb10>
20
+ 2025-01-21 11:43:00,130 - 93647 - httpcore.connection - DEBUG - start_tls.started ssl_context=<ssl.SSLContext object at 0x1a8d547a0> server_hostname='ui-assist.openai.azure.com' timeout=5.0
21
+ 2025-01-21 11:43:00,658 - 93647 - httpcore.connection - DEBUG - start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x1a82adc90>
22
+ 2025-01-21 11:43:00,658 - 93647 - httpcore.http11 - DEBUG - send_request_headers.started request=<Request [b'POST']>
23
+ 2025-01-21 11:43:00,659 - 93647 - httpcore.http11 - DEBUG - send_request_headers.complete
24
+ 2025-01-21 11:43:00,659 - 93647 - httpcore.http11 - DEBUG - send_request_body.started request=<Request [b'POST']>
25
+ 2025-01-21 11:43:00,659 - 93647 - httpcore.http11 - DEBUG - send_request_body.complete
26
+ 2025-01-21 11:43:00,659 - 93647 - httpcore.http11 - DEBUG - receive_response_headers.started request=<Request [b'POST']>
27
+ 2025-01-21 11:43:01,301 - 93647 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Length', b'1294'), (b'Content-Type', b'application/json'), (b'apim-request-id', b'4bfc921c-a044-45a5-8ca1-3bcf902a2973'), (b'Strict-Transport-Security', b'max-age=31536000; includeSubDomains; preload'), (b'x-content-type-options', b'nosniff'), (b'x-accel-buffering', b'no'), (b'x-ms-rai-invoked', b'true'), (b'x-request-id', b'f2a85b35-9ba3-4de1-b77b-d7f3c429344a'), (b'x-ms-region', b'East US'), (b'x-ratelimit-remaining-requests', b'118369'), (b'x-ratelimit-remaining-tokens', b'11819059'), (b'azureml-model-session', b'd032-20241226115614'), (b'x-envoy-upstream-service-time', b'534'), (b'x-ms-client-request-id', b'4bfc921c-a044-45a5-8ca1-3bcf902a2973'), (b'Date', b'Tue, 21 Jan 2025 16:43:00 GMT')])
28
+ 2025-01-21 11:43:01,303 - 93647 - httpx - INFO - HTTP Request: POST https://ui-assist.openai.azure.com/openai/deployments/gpt-4o-mini-2024-07-18/chat/completions?api-version=2024-02-01 "HTTP/1.1 200 OK"
29
+ 2025-01-21 11:43:01,303 - 93647 - httpcore.http11 - DEBUG - receive_response_body.started request=<Request [b'POST']>
30
+ 2025-01-21 11:43:01,303 - 93647 - httpcore.http11 - DEBUG - receive_response_body.complete
31
+ 2025-01-21 11:43:01,303 - 93647 - httpcore.http11 - DEBUG - response_closed.started
32
+ 2025-01-21 11:43:01,303 - 93647 - httpcore.http11 - DEBUG - response_closed.complete
33
+ 2025-01-21 11:43:01,490 - 93647 - browsergym.experiments.loop - DEBUG - Agent chose action:
34
+ click('20')
35
+ 2025-01-21 11:43:01,497 - 93647 - browsergym.experiments.loop - DEBUG - Step info saved.
36
+ 2025-01-21 11:43:01,497 - 93647 - browsergym.experiments.loop - INFO - The dialog box is currently open, and the "Close" button is visible and focused. I will click the "Close" button to close the dialog box.
37
+
38
+ action:
39
+ click('20')
40
+
41
+ 2025-01-21 11:43:01,501 - 93647 - browsergym.experiments.loop - DEBUG - Chat info sent.
42
+ 2025-01-21 11:43:01,501 - 93647 - browsergym.experiments.loop - DEBUG - Sending action to environment.
43
+ 2025-01-21 11:43:01,501 - 93647 - browsergym.core.env - DEBUG - Executing action
44
+ 2025-01-21 11:43:01,580 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
45
+ 2025-01-21 11:43:01,583 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
46
+ 2025-01-21 11:43:01,585 - 93647 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='file:///Users/t.lesellierdechezell/miniwob-plusplus/miniwob/html/miniwob/click-dialog.html'>
47
+ 2025-01-21 11:43:01,688 - 93647 - browsergym.core.env - DEBUG - Action executed
48
+ 2025-01-21 11:43:02,205 - 93647 - browsergym.core.env - DEBUG - Active page checked
49
+ 2025-01-21 11:43:02,205 - 93647 - browsergym.core.env - DEBUG - User message done
50
+ 2025-01-21 11:43:02,205 - 93647 - browsergym.core.env - DEBUG - Initiating task validation
51
+ 2025-01-21 11:43:02,209 - 93647 - browsergym.core.env - DEBUG - Task validation done
52
+ 2025-01-21 11:43:02,209 - 93647 - browsergym.core.observation - DEBUG - Marking frame ''
53
+ 2025-01-21 11:43:02,356 - 93647 - PIL.PngImagePlugin - DEBUG - STREAM b'IHDR' 16 13
54
+ 2025-01-21 11:43:02,356 - 93647 - PIL.PngImagePlugin - DEBUG - STREAM b'sRGB' 41 1
55
+ 2025-01-21 11:43:02,356 - 93647 - PIL.PngImagePlugin - DEBUG - STREAM b'IDAT' 54 1173
56
+ 2025-01-21 11:43:02,359 - 93647 - browsergym.core.env - DEBUG - Observation extracted
57
+ 2025-01-21 11:43:02,366 - 93647 - browsergym.experiments.loop - DEBUG - Environment stepped.
58
+ 2025-01-21 11:43:02,370 - 93647 - browsergym.experiments.loop - INFO - Saving summary info.
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/goal_object.pkl.gz ADDED
Binary file (105 Bytes). View file
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/package_versions.txt ADDED
@@ -0,0 +1,231 @@
1
+ Faker==33.3.1
2
+ Farama-Notifications==0.0.4
3
+ Flask==3.1.0
4
+ GitPython==3.1.44
5
+ Jinja2==3.1.5
6
+ MarkupSafe==2.1.5
7
+ PyYAML==6.0.2
8
+ Pygments==2.19.1
9
+ SQLAlchemy==2.0.37
10
+ Werkzeug==3.1.3
11
+ agentlab==0.3.2
12
+ agentlab==0.3.2
13
+ aiofiles==23.2.1
14
+ aiohappyeyeballs==2.4.4
15
+ aiohttp-cors==0.7.0
16
+ aiohttp==3.11.11
17
+ aiolimiter==1.2.1
18
+ aiosignal==1.3.2
19
+ annotated-types==0.7.0
20
+ anyio==4.8.0
21
+ asttokens==3.0.0
22
+ attrs==24.3.0
23
+ beartype==0.12.0
24
+ beautifulsoup4==4.12.3
25
+ black==24.10.0
26
+ blacken-docs==1.19.1
27
+ blinker==1.9.0
28
+ browsergym-assistantbench==0.13.3
29
+ browsergym-core==0.13.3
30
+ browsergym-experiments==0.13.3
31
+ browsergym-miniwob==0.13.3
32
+ browsergym-visualwebarena==0.13.3
33
+ browsergym-webarena==0.13.3
34
+ browsergym-workarena==0.4.1
35
+ browsergym==0.13.3
36
+ cachetools==5.5.0
37
+ certifi==2024.12.14
38
+ cfgv==3.4.0
39
+ charset-normalizer==3.4.1
40
+ click==8.1.8
41
+ cloudpickle==3.1.1
42
+ colorama==0.4.6
43
+ colorama==0.4.6
44
+ colorful==0.5.6
45
+ contexttimer==0.3.3
46
+ contourpy==1.3.1
47
+ cycler==0.12.1
48
+ dask==2024.12.1
49
+ dataclasses-json==0.6.7
50
+ datasets==3.2.0
51
+ decorator==5.1.1
52
+ dill==0.3.8
53
+ distlib==0.3.9
54
+ distributed==2024.12.1
55
+ distro==1.9.0
56
+ english-words==2.0.1
57
+ evaluate==0.4.3
58
+ execnet==2.1.1
59
+ executing==2.1.0
60
+ fastapi==0.115.6
61
+ ffmpy==0.5.0
62
+ filelock==3.16.1
63
+ flaky==3.8.1
64
+ fonttools==4.55.3
65
+ frozenlist==1.5.0
66
+ fsspec==2024.9.0
67
+ gitdb==4.0.12
68
+ google-api-core==2.24.0
69
+ google-auth==2.37.0
70
+ googleapis-common-protos==1.66.0
71
+ gradio==5.12.0
72
+ gradio_client==1.5.4
73
+ greenlet==3.1.1
74
+ grpcio==1.69.0
75
+ gymnasium==1.0.0
76
+ h11==0.14.0
77
+ httpcore==1.0.7
78
+ httpx-sse==0.4.0
79
+ httpx==0.28.1
80
+ huggingface-hub==0.27.1
81
+ identify==2.6.5
82
+ idna==3.10
83
+ imageio==2.36.1
84
+ importlib_metadata==8.5.0
85
+ iniconfig==2.0.0
86
+ ipython==8.31.0
87
+ itsdangerous==2.2.0
88
+ jedi==0.19.2
89
+ jiter==0.8.2
90
+ joblib==1.4.2
91
+ jsonpatch==1.33
92
+ jsonpointer==3.0.0
93
+ jsonschema-specifications==2024.10.1
94
+ jsonschema==4.23.0
95
+ kiwisolver==1.4.8
96
+ langchain-community==0.3.14
97
+ langchain-core==0.3.29
98
+ langchain-text-splitters==0.3.5
99
+ langchain==0.3.14
100
+ langsmith==0.2.10
101
+ lazy_loader==0.4
102
+ libvisualwebarena==0.0.15
103
+ libwebarena==0.0.4
104
+ linkify-it-py==2.0.3
105
+ locket==1.0.0
106
+ lxml==5.3.0
107
+ markdown-it-py==3.0.0
108
+ marshmallow==3.25.1
109
+ matplotlib-inline==0.1.7
110
+ matplotlib==3.10.0
111
+ mdit-py-plugins==0.4.2
112
+ mdurl==0.1.2
113
+ memray==1.15.0
114
+ mpmath==1.3.0
115
+ msgpack==1.1.0
116
+ multidict==6.1.0
117
+ multiprocess==0.70.16
118
+ mypy-extensions==1.0.0
119
+ networkx==3.4.2
120
+ nltk==3.9.1
121
+ nodeenv==1.9.1
122
+ numpy==1.26.4
123
+ openai==1.59.7
124
+ opencensus-context==0.1.3
125
+ opencensus==0.11.4
126
+ orjson==3.10.14
127
+ packaging==24.2
128
+ pandas==2.2.3
129
+ parso==0.8.4
130
+ partd==1.4.2
131
+ pathspec==0.12.1
132
+ pexpect==4.9.0
133
+ pillow==11.1.0
134
+ pip==24.3.1
135
+ platformdirs==4.3.6
136
+ playwright==1.49.1
137
+ pluggy==1.5.0
138
+ portalocker==3.1.1
139
+ pre_commit==4.0.1
140
+ prometheus_client==0.21.1
141
+ prompt_toolkit==3.0.48
142
+ propcache==0.2.1
143
+ proto-plus==1.25.0
144
+ protobuf==5.29.3
145
+ psutil==6.1.0
146
+ psutil==6.1.1
147
+ ptyprocess==0.7.0
148
+ pure_eval==0.2.3
149
+ py-spy==0.4.0
150
+ pyarrow==19.0.0
151
+ pyasn1==0.6.1
152
+ pyasn1_modules==0.4.1
153
+ pydantic-settings==2.7.1
154
+ pydantic==2.10.5
155
+ pydantic_core==2.27.2
156
+ pydub==0.25.1
157
+ pyee==12.0.0
158
+ pyparsing==3.2.1
159
+ pytest-base-url==2.1.0
160
+ pytest-playwright==0.6.2
161
+ pytest-xdist==3.6.1
162
+ pytest==8.3.4
163
+ python-dateutil==2.9.0.post0
164
+ python-dotenv==1.0.1
165
+ python-multipart==0.0.20
166
+ python-slugify==8.0.4
167
+ pytz==2024.2
168
+ ray==2.40.0
169
+ referencing==0.35.1
170
+ regex==2024.11.6
171
+ requests-toolbelt==1.0.0
172
+ requests==2.32.3
173
+ rich==13.9.4
174
+ rpds-py==0.22.3
175
+ rsa==4.9
176
+ ruff==0.9.2
177
+ sacrebleu==2.5.1
178
+ safehttpx==0.1.6
179
+ safetensors==0.5.2
180
+ scikit-image==0.25.0
181
+ scipy==1.15.1
182
+ semantic-version==2.10.0
183
+ setproctitle==1.2.2
184
+ setuptools==75.8.0
185
+ shellingham==1.5.4
186
+ six==1.17.0
187
+ smart-open==7.1.0
188
+ smmap==5.0.2
189
+ sniffio==1.3.1
190
+ sortedcontainers==2.4.0
191
+ soupsieve==2.6
192
+ stack-data==0.6.3
193
+ starlette==0.41.3
194
+ sympy==1.13.3
195
+ tabulate==0.9.0
196
+ tblib==3.0.0
197
+ tenacity==9.0.0
198
+ text-generation==0.7.0
199
+ text-unidecode==1.3
200
+ textual==1.0.0
201
+ tifffile==2025.1.10
202
+ tiktoken==0.8.0
203
+ tokenize_rt==6.1.0
204
+ tokenizers==0.21.0
205
+ tomlkit==0.13.2
206
+ toolz==1.0.0
207
+ torch==2.2.2
208
+ tornado==6.4.2
209
+ tqdm==4.67.1
210
+ traitlets==5.14.3
211
+ transformers==4.48.0
212
+ typer==0.15.1
213
+ types-requests==2.32.0.20241016
214
+ types-tqdm==4.67.0.20241221
215
+ typing-inspect==0.9.0
216
+ typing_extensions==4.12.2
217
+ tzdata==2024.2
218
+ uc-micro-py==1.0.3
219
+ urllib3==2.3.0
220
+ uvicorn==0.34.0
221
+ virtualenv==20.29.0
222
+ wcwidth==0.2.13
223
+ weblinx-browsergym==0.0.1.dev14
224
+ weblinx==0.3.2
225
+ websockets==14.1
226
+ wheel==0.45.1
227
+ wrapt==1.17.2
228
+ xxhash==3.5.0
229
+ yarl==1.18.3
230
+ zict==3.0.0
231
+ zipp==3.21.0
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/screenshot_step_0.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/screenshot_step_1.png ADDED
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/step_0.pkl.gz ADDED
Binary file (8.1 kB). View file
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/step_1.pkl.gz ADDED
Binary file (4.75 kB). View file
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28/summary_info.json ADDED
@@ -0,0 +1,44 @@
1
+ {
2
+ "n_steps": 1,
3
+ "cum_reward": 1.0,
4
+ "cum_raw_reward": 0,
5
+ "err_msg": null,
6
+ "stack_trace": null,
7
+ "stats.cum_steps": 2,
8
+ "stats.cum_n_token_goal": 10,
9
+ "stats.max_n_token_goal": 10,
10
+ "stats.cum_n_token_url": 30,
11
+ "stats.max_n_token_url": 30,
12
+ "stats.cum_n_token_focused_element_bid": 1,
13
+ "stats.max_n_token_focused_element_bid": 1,
14
+ "stats.cum_n_token_last_action": 0,
15
+ "stats.max_n_token_last_action": 0,
16
+ "stats.cum_n_token_last_action_error": 0,
17
+ "stats.max_n_token_last_action_error": 0,
18
+ "stats.cum_n_token_dom_txt": 1250,
19
+ "stats.max_n_token_dom_txt": 1250,
20
+ "stats.cum_n_token_axtree_txt": 104,
21
+ "stats.max_n_token_axtree_txt": 104,
22
+ "stats.cum_n_token_pruned_html": 651,
23
+ "stats.max_n_token_pruned_html": 651,
24
+ "stats.cum_n_retry_llm": 1,
25
+ "stats.max_n_retry_llm": 1,
26
+ "stats.cum_n_retry": 0.0,
27
+ "stats.max_n_retry": 0.0,
28
+ "stats.cum_busted_retry": 0,
29
+ "stats.max_busted_retry": 0,
30
+ "stats.cum_input_tokens": 1638,
31
+ "stats.max_input_tokens": 1638,
32
+ "stats.cum_output_tokens": 48,
33
+ "stats.max_output_tokens": 48,
34
+ "stats.cum_cost": 0.0002745,
35
+ "stats.max_cost": 0.0002745,
36
+ "stats.cum_n_token_agent_messages": 1678,
37
+ "stats.max_n_token_agent_messages": 1678,
38
+ "stats.cum_step_elapsed": 4.17129111289978,
39
+ "stats.max_step_elapsed": 4.17129111289978,
40
+ "stats.cum_agent_elapsed": 2.246234178543091,
41
+ "stats.max_agent_elapsed": 2.246234178543091,
42
+ "terminated": true,
43
+ "truncated": false
44
+ }
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/error_report_trial_1_of_3.md ADDED
File without changes
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/result_df_trial_1_of_3.csv ADDED
@@ -0,0 +1,5 @@
1
+ env.task_name,agent.agent_name,env.benchmark,index,exp_dir,agent.chat_model.model_name,agent.chat_model.max_total_tokens,agent.chat_model.max_input_tokens,agent.chat_model.max_new_tokens,agent.chat_model.temperature,agent.chat_model.vision_support,agent.chat_model.deployment_name,agent.flags.obs.use_html,agent.flags.obs.use_ax_tree,agent.flags.obs.use_tabs,agent.flags.obs.use_focused_element,agent.flags.obs.use_error_logs,agent.flags.obs.use_history,agent.flags.obs.use_past_error_logs,agent.flags.obs.use_action_history,agent.flags.obs.use_think_history,agent.flags.obs.use_diff,agent.flags.obs.html_type,agent.flags.obs.use_screenshot,agent.flags.obs.use_som,agent.flags.obs.extract_visible_tag,agent.flags.obs.extract_clickable_tag,agent.flags.obs.extract_coords,agent.flags.obs.filter_visible_elements_only,agent.flags.obs.openai_vision_detail,agent.flags.obs.filter_with_bid_only,agent.flags.obs.filter_som_only,agent.flags.action.action_set.subsets,agent.flags.action.action_set.multiaction,agent.flags.action.action_set.strict,agent.flags.action.action_set.retry_with_force,agent.flags.action.action_set.demo_mode,agent.flags.action.long_description,agent.flags.action.individual_examples,agent.flags.action.multi_actions,agent.flags.action.is_strict,agent.flags.use_plan,agent.flags.use_criticise,agent.flags.use_thinking,agent.flags.use_memory,agent.flags.use_concrete_example,agent.flags.use_abstract_example,agent.flags.use_hints,agent.flags.enable_chat,agent.flags.max_prompt_tokens,agent.flags.be_cautious,agent.flags.extra_instructions,agent.flags.add_missparsed_messages,agent.flags.max_trunc_itr,agent.flags.flag_group,agent.max_retry,env.task_seed,env.max_steps,env.headless,env.record_video,env.wait_for_user_message,env.viewport,env.slow_mo,env.storage_state,env.task_kwargs,exp_name,enable_debug,err_msg,stack_trace,order,logging_level,logging_level_stdout,exp_id,depends_on,save_screenshot,save_som,n_steps,cum_reward,cum_raw_reward,stats.cum_steps,stats.cum_n_token_goal,stats.max_n_token_goal,stats.cum_n_token_url,stats.max_n_token_url,stats.cum_n_token_focused_element_bid,stats.max_n_token_focused_element_bid,stats.cum_n_token_last_action,stats.max_n_token_last_action,stats.cum_n_token_last_action_error,stats.max_n_token_last_action_error,stats.cum_n_token_dom_txt,stats.max_n_token_dom_txt,stats.cum_n_token_axtree_txt,stats.max_n_token_axtree_txt,stats.cum_n_token_pruned_html,stats.max_n_token_pruned_html,stats.cum_n_retry_llm,stats.max_n_retry_llm,stats.cum_n_retry,stats.max_n_retry,stats.cum_busted_retry,stats.max_busted_retry,stats.cum_input_tokens,stats.max_input_tokens,stats.cum_output_tokens,stats.max_output_tokens,stats.cum_cost,stats.max_cost,stats.cum_n_token_agent_messages,stats.max_n_token_agent_messages,stats.cum_step_elapsed,stats.max_step_elapsed,stats.cum_agent_elapsed,stats.max_agent_elapsed,terminated,truncated,err_key
+ miniwob.click-checkboxes,GenericAgent-gpt-4o-mini,miniwob,0,/Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7,gpt-4o-mini,128000,128000,16384,0.1,True,gpt-4o-mini-2024-07-18,True,True,False,True,True,True,False,True,False,False,pruned_html,False,False,True,True,False,False,auto,False,False,"('miniwob_all',)",False,False,True,off,False,False,,,False,False,True,False,True,True,True,False,40000,True,,True,20,,4,7,5,True,False,False,,,,,GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_7,True,,,2,10,30,8005be05-9300-454e-8a82-366111902732,(),True,False,2,1.0,0,3,12,6,62,31,2,1,4,4,0,0,1902,952,468,235,650,326,2,1,0.0,0.0,0,0,2889,1454,122,63,0.0005065499999999999,0.0002559,3002,1512,5.017662048339844,4.137955188751221,3.1547908782958984,2.342694044113159,True,False,
+ miniwob.click-checkboxes,GenericAgent-gpt-4o-mini,miniwob,2,/Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20,gpt-4o-mini,128000,128000,16384,0.1,True,gpt-4o-mini-2024-07-18,True,True,False,True,True,True,False,True,False,False,pruned_html,False,False,True,True,False,False,auto,False,False,"('miniwob_all',)",False,False,True,off,False,False,,,False,False,True,False,True,True,True,False,40000,True,,True,20,,4,20,5,True,False,False,,,,,GenericAgent-gpt-4o-mini_on_miniwob.click-checkboxes_20,True,,,3,10,30,f78dc433-ea2f-4d66-83c9-7a8f1d491bcc,(),True,False,3,1.0,0,4,27,9,93,31,3,1,8,4,0,0,2892,966,769,257,1014,340,3,1,0.0,0.0,0,0,4489,1514,224,83,0.0008077499999999999,0.0002715,4670,1573,5.8213889598846436,4.13917088508606,5.303440809249878,3.529852867126465,True,False,
+ miniwob.click-dialog,GenericAgent-gpt-4o-mini,miniwob,1,/Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14,gpt-4o-mini,128000,128000,16384,0.1,True,gpt-4o-mini-2024-07-18,True,True,False,True,True,True,False,True,False,False,pruned_html,False,False,True,True,False,False,auto,False,False,"('miniwob_all',)",False,False,True,off,False,False,,,False,False,True,False,True,True,True,False,40000,True,,True,20,,4,14,5,True,False,False,,,,,GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_14,True,,,1,10,30,ef0b660d-b695-4822-9f78-598fd29612b9,(),True,False,1,1.0,0,2,10,10,30,30,1,1,0,0,0,0,1257,1257,108,108,658,658,1,1,0.0,0.0,0,0,1643,1643,56,56,0.00028005,0.00028005,1697,1697,4.1717939376831055,4.1717939376831055,2.4256081581115723,2.4256081581115723,True,False,
+ miniwob.click-dialog,GenericAgent-gpt-4o-mini,miniwob,3,/Users/t.lesellierdechezell/agentlab_results/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/2025-01-21_11-42-43_GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28,gpt-4o-mini,128000,128000,16384,0.1,True,gpt-4o-mini-2024-07-18,True,True,False,True,True,True,False,True,False,False,pruned_html,False,False,True,True,False,False,auto,False,False,"('miniwob_all',)",False,False,True,off,False,False,,,False,False,True,False,True,True,True,False,40000,True,,True,20,,4,28,5,True,False,False,,,,,GenericAgent-gpt-4o-mini_on_miniwob.click-dialog_28,True,,,0,10,30,5ce97677-801e-44fc-9a2f-8475a4195d52,(),True,False,1,1.0,0,2,10,10,30,30,1,1,0,0,0,0,1250,1250,104,104,651,651,1,1,0.0,0.0,0,0,1638,1638,48,48,0.0002745,0.0002745,1678,1678,4.17129111289978,4.17129111289978,2.246234178543091,2.246234178543091,True,False,
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/study.pkl.gz ADDED
Binary file (3.8 kB).
 
data/2025-01-21_11-42-43_genericagent-gpt-4o-mini-on-miniwob-tiny-test/summary_df_trial_1_of_3.csv ADDED
@@ -0,0 +1,2 @@
+ agent.agent_name,env.benchmark,avg_reward,std_err,avg_steps,n_completed,n_err,cum_cost
+ GenericAgent-gpt-4o-mini,miniwob,1.0,0.0,1.75,4/4,0,0.0019
requirements.txt ADDED
@@ -0,0 +1 @@
+ agentlab