Published Mar 18, 2025
tl;dr: principles/strategies for using reasoning models (LLMs with chain of thought); please click on the navigation to the left to hop about.
Another course by DeepLearning.AI, Reasoning with o1. This course covers the new chain-of-thought models, where the model reasons through a series of intermediate steps, a.k.a. giving the model 'time to think', making it good for reasoning and planning. It teaches which tasks suit this kind of model, how to use it effectively, and how to use it in combination with other models.
Reminder that the OpenAI cookbook exists, which is a collection of their examples for specific topics.
Disclaimer: LLMs are not perfect; they suffer from 'hallucination' and make things up, so the output should still be scrutinized.
Previous LLMs were like "children" answering questions with the first thing that comes to mind, but the new reasoning models think before they speak using the chain-of-thought framework. o1 is a new reasoning model. Two versions are available in ChatGPT/the OpenAI API: o1 and o1-mini. o1 is good for complex tasks requiring broad general knowledge; o1-mini is good for coding/math/science.
Recommended use cases for o1:
4 principles for prompting the o1 model:
o1 is great at planning, but given its latency and cost, it is best to have a gpt4 model execute the resulting plan.
We can tell o1 that it will have access to gpt4 as an agent and which functions that agent can execute. This is more about using the API to set up the LLM calls and execute things. The example given looked at stock levels and decided whether more items needed to be ordered.
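A minimal planner/executor sketch of this idea, assuming the openai Python SDK; the model names ("o1", "gpt-4o") and the inventory functions are illustrative assumptions, not taken from the course material:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical functions the executor model is allowed to call.
tools = [
    {"type": "function", "function": {
        "name": "check_stock_level",
        "description": "Return the current stock level for a product.",
        "parameters": {"type": "object",
                       "properties": {"product_id": {"type": "string"}},
                       "required": ["product_id"]}}},
    {"type": "function", "function": {
        "name": "place_order",
        "description": "Order more units of a product.",
        "parameters": {"type": "object",
                       "properties": {"product_id": {"type": "string"},
                                      "quantity": {"type": "integer"}},
                       "required": ["product_id", "quantity"]}}},
]

# 1. o1 plans: ask for a numbered list of steps that only use the functions above.
plan = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": (
        "You can delegate work to a gpt4 agent that has these functions: "
        "check_stock_level(product_id), place_order(product_id, quantity). "
        "Write a short numbered plan to keep product 'A123' above 50 units, "
        "one function call per step."
    )}],
).choices[0].message.content

# 2. gpt4 executes: feed each plan step to the cheaper model with the tools attached.
for step in plan.splitlines():
    if not step.strip():
        continue
    msg = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Carry out this step: {step}"}],
        tools=tools,
    ).choices[0].message
    if msg.tool_calls:
        for call in msg.tool_calls:
            print(step, "->", call.function.name, call.function.arguments)
    else:
        print(step, "->", msg.content)
```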
Simple and direct prompts work best; here is an example:
"""Create an elegant, delightful React component for an Interview Feedback Form where:
1. The interviewer rates candidates across multiple dimensions using rubrics
2. Each rating must include specific evidence/examples
3. The final recommendation should auto-calculate based on a weighted scoring system
4. The UI should guide interviewers to give specific, behavioral feedback
The goal is to enforce structured, objective feedback gathering. A smart model should:
- Create a thoughtful rubric structure
- Add helpful prompts/placeholders
- Build in intelligent validation
Make sure to
- Call the element FeedbackForm
- Start the code with "use client"
Respond with the code only! Nothing else!"""
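A minimal sketch of sending this prompt with the openai Python SDK (the choice of "o1-mini" is an assumption; any reasoning model would do, and o1 models are typically called with just a user message):

```python
from openai import OpenAI

client = OpenAI()

# The Interview Feedback Form prompt shown above.
prompt = """Create an elegant, delightful React component for an Interview Feedback Form where: ..."""

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": prompt}],
)
component_code = response.choices[0].message.content
print(component_code)
```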
One thing we can also do is run this iteratively, i.e. get o1 to generate the code and then get o1 to assess/grade it.
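A hedged sketch of that generate-then-grade loop, reusing `prompt` from the snippet above; the rubric wording and the number of refinement rounds are made up:

```python
from openai import OpenAI

client = OpenAI()

def ask_o1(content: str) -> str:
    """Single user-message call to a reasoning model."""
    response = client.chat.completions.create(
        model="o1-mini",
        messages=[{"role": "user", "content": content}],
    )
    return response.choices[0].message.content

code = ask_o1(prompt)  # first draft of the FeedbackForm component

for _ in range(2):  # a couple of refinement rounds
    # Ask o1 to grade its own output against a rubric.
    review = ask_o1(
        "Grade this React component from 1-10 on correctness, validation and UX, "
        "and list concrete fixes:\n\n" + code
    )
    # Feed the review back in and ask for a revised version.
    code = ask_o1(
        "Revise the component below to address the review. Respond with the code only.\n\n"
        "Review:\n" + review + "\n\nCode:\n" + code
    )
```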
The demo in the class basically shows how much better o1 is than gpt4 at interpreting images. The demo gave it an organization chart and it answered questions about the chart effectively. It does this by reading the image and extracting the information in JSON format, which is then used to answer the questions.
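A rough sketch of that two-step pattern, assuming the openai Python SDK and that the o1 model accepts image inputs; the file name, JSON shape and question are all assumptions:

```python
import base64
from openai import OpenAI

client = OpenAI()

# Encode the chart image as a base64 data URL.
with open("org_chart.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

# Step 1: extract the chart into structured JSON.
extraction = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": [
        {"type": "text",
         "text": ("Extract this organization chart as a JSON list of "
                  "{name, title, reports_to} objects. Respond with JSON only.")},
        {"type": "image_url",
         "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
    ]}],
)
org_json = extraction.choices[0].message.content

# Step 2: answer questions against the extracted JSON rather than the raw image.
answer = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user",
               "content": f"Using this org chart data:\n{org_json}\n\nWho reports to the CTO?"}],
)
print(answer.choices[0].message.content)
```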
This is a concept where we use LLMs to generate prompts for other LLMs, namely using o1 to generate a prompt for gpt4. The idea is that we have a set of evaluation criteria, which is used to optimize the gpt4 prompt that o1 generates.
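A hedged sketch of this meta-prompting loop: o1 drafts a system prompt for gpt4, a gpt4 reply on a test case is checked against the criteria, and the critique is fed back to o1 to improve the prompt. The model names, criteria, test message and number of rounds are all assumptions:

```python
from openai import OpenAI

client = OpenAI()

criteria = ("Be polite, resolve the issue in under 5 turns, "
            "never promise refunds you cannot give.")
task = "Customer-service chatbot for an online retailer."
test_message = "My parcel is two weeks late and I want my money back!"

# o1 writes the initial system prompt for the gpt4 agent.
gpt4_prompt = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user",
               "content": f"Write a system prompt for a gpt4 agent.\n"
                          f"Task: {task}\nIt will be judged on: {criteria}"}],
).choices[0].message.content

for _ in range(3):  # a few optimization rounds
    # Run the candidate prompt on a test case with the cheaper model.
    reply = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": gpt4_prompt},
                  {"role": "user", "content": test_message}],
    ).choices[0].message.content

    # Ask o1 to critique the reply against the criteria and rewrite the prompt.
    gpt4_prompt = client.chat.completions.create(
        model="o1",
        messages=[{"role": "user", "content": (
            f"Criteria: {criteria}\nCurrent system prompt:\n{gpt4_prompt}\n"
            f"Test reply:\n{reply}\n"
            "Critique the reply against the criteria, then output an improved "
            "system prompt only."
        )}],
    ).choices[0].message.content
```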
The example seems to be more about training an AI agent, such as a chatbot, to handle customer service.
Steps: