Fine-Tune a GPT-3 Model with a Provided Training and Validation Dataset

In this tutorial, we will fine-tune our own GPT-3 model on a provided training and validation dataset. The task is text summarization.

By Jason Chuang 2023-03-13

1. Getting OpenAI API key

Go to https://platform.openai.com/account/api-keys and log in. Generate a new secret key and store it safely.

2. Checking the prepared training and validation data

Each file should contain a number of prompt and completion pairs, one JSON object per line:

{"prompt": "Summarize the following text:Our new business production totaled $389 million of direct PVP exceeded by $75 million -- the direct PVP we produced in every year but once since 2010.\n....", "completion": "Turning to our fourth quarter 2020 results, adjusted operating income was $56 million or $0.69 per share compared with $87 million or $0.90 per share in the fourth quarter of 2019."}
{"prompt": "Summarize the following text:Overall, restaurant traffic has largely stabilized at about 5% below pre-pandemic levels led by the continued solid performance at quick service restaurants.\nDemand in U.S. retail channels also remained solid with overall category volumes in the quarter still up 15% to 20% from pre-pandemic levels....", "completion": "Specifically in the quarter, sales increased 13% to $984 million, with volume up 11% and price mix up 2%.\nDiluted earnings per share in the first quarter was $0.20, down from $0.61 in the prior year, while adjusted EBITDA including joint ventures was $123 million, down from $202 million."}
........................
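
Before uploading, it can help to confirm that every line really is a JSON object with non-empty "prompt" and "completion" strings. The following is a minimal sketch of such a check, not part of the original workflow; the function name validate_jsonl_lines and the sample records are made up for illustration.

```python
import json


def validate_jsonl_lines(lines):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{key}'"))
    return problems


# Hypothetical records: the second one lacks a completion.
sample = [
    '{"prompt": "Summarize the following text:Example input.", "completion": "Example summary."}',
    '{"prompt": "Summarize the following text:No completion here."}',
]
print(validate_jsonl_lines(sample))
```

To check a real file, pass an open file handle instead of the sample list, for example the handle returned by open("prepared_train.jsonl").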

3. Let's get started

First, install the OpenAI library, import it, and set your API key:

!pip install --upgrade openai
import os
import openai
os.environ["OPENAI_API_KEY"] = 'YOUR_API_KEY'
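
A common slip is to run the later commands while the key is still the 'YOUR_API_KEY' placeholder. The sketch below shows one possible guard; require_api_key is a hypothetical helper, not something the OpenAI library provides, and it only inspects the environment mapping it is given.

```python
import os

PLACEHOLDER = "YOUR_API_KEY"  # the placeholder string used in this tutorial


def require_api_key(env=os.environ):
    """Return the API key from env, or raise if it is missing or still the placeholder."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("Set OPENAI_API_KEY to your real secret key first.")
    return key
```

Calling require_api_key() before the fine-tune commands makes a forgotten key fail early with a clear message.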

We pass the base GPT-3 model as a parameter with -m, such as ada, curie, or davinci. The -t and -v options give the training and validation files:

!openai api fine_tunes.create -t "prepared_train.jsonl" -v "prepared_val.jsonl" -m ada

It takes time for the fine-tune job to be queued and for the model to train. Let's wait and check the status, using the ID of the job we created:

!openai api fine_tunes.follow -i ft-3Wnb4hOrXU1FuQGDRfyvNWlz

After the fine-tuned model is created, we can test it. We will use the following prompt:

Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance.
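
The test prompt starts with the same "Summarize the following text:" prefix as the training examples, with no space after the colon. A small helper can keep that consistent for new inputs. This is only a sketch; build_prompt and PROMPT_PREFIX are illustrative names, not part of the original code.

```python
PROMPT_PREFIX = "Summarize the following text:"  # prefix copied from the training data


def build_prompt(text):
    """Prepend the training-data prefix to the input text, trimming surrounding whitespace."""
    return PROMPT_PREFIX + text.strip()


print(build_prompt("  During the first quarter, we maintained a very safe environment.  "))
```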

Using the command line:

!openai api completions.create -m ada:ft-tpisoftware-2023-03-01-00-10-20 -p "Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance."

Or using Python code:

openai.api_key = os.getenv("OPENAI_API_KEY")
openai.Completion.create(
  model="ada:ft-tpisoftware-2023-03-01-00-10-20",
  prompt="Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance.",
  max_tokens=256,
  temperature=0
)

Sample result:

<OpenAIObject text_completion id=cmpl-6p4x0eQuRB83JEWs19ssaSS5g7TKC at 0x1011ea4a0> JSON: {
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "\nWe have been in the business of providing our customers with the best quality products and services for over 40 years."
    }
  ],
  "created": 1677631778,
  "id": "cmpl-6p4x0eQuRB83JEWs19ssaSS5g7TKC",
  "model": "ada:ft-tpisoftware-2023-03-01-00-10-20",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 256,
    "prompt_tokens": 17,
    "total_tokens": 273
  }
}
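
The generated text sits in choices[0]["text"], and finish_reason tells us whether generation stopped because it hit max_tokens ("length"). The sketch below pulls both out of a response; it uses a plain dict that mirrors the structure of the result above with made-up values, and extract_summary is a hypothetical helper name.

```python
def extract_summary(response):
    """Return (summary_text, was_truncated) from a completion response mapping."""
    choice = response["choices"][0]
    text = choice["text"].strip()  # drop the leading newline and surrounding whitespace
    was_truncated = choice["finish_reason"] == "length"  # True when max_tokens was reached
    return text, was_truncated


# Made-up response with the same shape as the sample result.
sample_response = {
    "choices": [
        {"finish_reason": "stop", "index": 0, "logprobs": None, "text": "\nExample summary."}
    ],
    "object": "text_completion",
}
print(extract_summary(sample_response))
```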

4. References

GitHub: https://github.com/chuangtc/ECTSum-GPT3

OpenAI: https://platform.openai.com/docs/guides/fine-tuning