readme update
README.md CHANGED
@@ -4,6 +4,8 @@ It thinks like o1
## TODO

+ Todo
+
[ ] Add fallback llms
[ ] Better error handling
[ ] Add Tools (web, math, code)

@@ -27,6 +29,15 @@ streamlit run app.py

HAVE FUN.

+ ## Findings
+
+ Although this project tries to mimic OpenAI's o1, it often falls short at generating better reflections on its previous answers. I think that comes from the lack of this kind of data in the training of the models used here (models that predate o1); these models were probably not trained to fix mistakes with better reasoning.
+
+ For example, here a `cerebras/llama3.1-70b` model jumps back and forth between 2 and 3 when counting the "r"s in "Strawberry". Even when it has second thoughts, it does not stick to its reasoning, and because of the model's bias it generates wrong answers. Maybe prompting can solve this, but training with such data would be better.
+ ![wrong answer formation due to model bias](src/error-image.png)
+

## Helpful Papers

1. To CoT or not to CoT? CHAIN-OF-THOUGHT HELPS MAINLY ON MATH AND SYMBOLIC REASONING
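For reference on the counting example in the Findings above, a minimal sketch in plain Python (illustrative only, outside the app's own code) shows the count the model keeps abandoning: "strawberry" contains 3 "r"s.

```python
# Quick check of the "r"-count example (illustrative sketch, not part of the app).
word = "strawberry"
print(word.count("r"))  # -> 3
```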