2024-10-23 17:25
Here is an interesting new paper on improving Chain-of-Thought (CoT) accuracy.
They find that including both correct and incorrect reasoning paths in the demonstrations improves the accuracy of the intermediate reasoning steps and of the overall CoT.
I am not surprised by this, because we often see that when we give LLMs feedback (e.g., solution hints or pointing out mistakes), they tend to produce better results. Like humans, LLMs can also "learn" from failures. I have seen something similar for RAG and even agentic systems.
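A rough sketch of what such a contrastive demonstration could look like in a prompt. The example problem, the prompt wording, and the `build_prompt` helper are my own illustration, not taken from the paper:

```python
# Sketch of a contrastive CoT prompt: each demonstration pairs a correct
# reasoning path with an incorrect one, so the model sees both what to do
# and what to avoid. The arithmetic example below is illustrative only.

CONTRASTIVE_DEMO = """\
Question: A store sells pens at 3 for $2. How much do 12 pens cost?

Correct reasoning: 12 pens is 12 / 3 = 4 groups of 3 pens.
Each group costs $2, so the total is 4 * $2 = $8.
Answer: $8

Incorrect reasoning: Each pen costs $2, so 12 pens cost 12 * $2 = $24.
Answer: $24 (wrong: the price is $2 per 3 pens, not per pen)
"""

def build_prompt(question: str) -> str:
    """Prepend the contrastive demonstration to a new question."""
    return (
        CONTRASTIVE_DEMO
        + "\nQuestion: " + question
        + "\nLet's reason step by step, avoiding the kind of mistake shown above.\n"
    )

if __name__ == "__main__":
    print(build_prompt("A train travels 60 miles in 1.5 hours. What is its average speed?"))
```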