API-based Models

Taking OpenAI's API models as an example, run the evaluation with the code/run_chatgpt_eval.sh script. Apart from openai_key, the parameters are the same as those used when evaluating a model from its weights. The two core parameters are model_name and openai_key.

Set model_name to the exact OpenAI model identifier. For example, to evaluate GPT-4, pass gpt-4 as model_name in the evaluation script. For the list of valid model names, refer to the OpenAI official website.

We also recommend using an openai_key from a paid account: the $5 version is rate limited, which slows down the evaluation.
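If a rate limit does interrupt a run, one option is to retry with increasing pauses. The sketch below is not part of the original scripts: retry_with_backoff is a hypothetical helper, and it assumes the evaluation exits with a non-zero status when requests are rejected, which the source does not confirm.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): rerun a command
# with exponentially growing pauses while it exits with a non-zero status.
retry_with_backoff() {
    local max_attempts=$1   # total number of attempts, including the first
    local delay=$2          # pause before the second attempt, in seconds
    shift 2
    local attempt=1
    while true; do
        "$@" && return 0            # success: stop retrying
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; sleeping ${delay}s" >&2
        sleep "$delay"
        delay=$((delay * 2))        # double the pause each time
        attempt=$((attempt + 1))
    done
}

# Example (assumes a rate-limited run exits non-zero):
# retry_with_backoff 5 10 bash code/run_chatgpt_eval.sh
```

Rerunning the whole script is coarse, since every retry starts the evaluation from the beginning. Retrying individual API calls inside the evaluation code would waste less work, if that code can be changed.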

#!/bin/bash
export PROJ_HOME=$PWD
export KMP_DUPLICATE_LIB_OK=TRUE

# Set the API key
openai_key=sk-***************** # Fill in your openai_key here for evaluation

exp_name=chatgpt
exp_date=$(date +"%Y%m%d%H%M%S")
output_path=$PROJ_HOME/output_dir/${exp_name}/$exp_date

echo "exp_date: $exp_date"
echo "output_path: $output_path"

python eval_chatgpt.py \
    --openai_key "${openai_key}" \
    --cot False \
    --few_shot False \
    --n_times 1 \
    --ntrain 5 \
    --do_test False \
    --do_save_csv False \
    --output_dir "${output_path}" \
    --model_name gpt-4 # Please fill in the correct OpenAI model name
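
The script above stores the key in the file itself. As a possible alternative, the key can be read from the environment so that it never ends up in version control. This is a sketch only: resolve_openai_key is a hypothetical helper, and OPENAI_API_KEY is the variable name conventionally used by OpenAI tooling, not something the original script reads.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): print the key taken
# from the OPENAI_API_KEY environment variable, or fail if it is missing.
resolve_openai_key() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set" >&2
        return 1
    fi
    printf '%s' "$OPENAI_API_KEY"
}

# In run_chatgpt_eval.sh this could replace the hard-coded assignment:
# openai_key=$(resolve_openai_key) || exit 1
```

With this change, the key is exported once in the shell (export OPENAI_API_KEY=...) before the script is run, and the script stops early if the variable is missing.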

Open-source LLMs can be evaluated in the same way: use FastChat to serve the model behind an OpenAI-compatible API, then run the evaluation against that API.
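
A minimal sketch of the FastChat side, assuming FastChat is installed and the three standard entry points from its documentation are used (controller, model worker, OpenAI-style API server). The model path, host, port, and sleep durations are placeholders chosen for this example, and the helper function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: launch FastChat's OpenAI-compatible server for a local model.
# MODEL_PATH, API_HOST and API_PORT are example values, not project defaults.
MODEL_PATH=${MODEL_PATH:-lmsys/vicuna-7b-v1.5}
API_HOST=${API_HOST:-localhost}
API_PORT=${API_PORT:-8000}

api_base_url() {
    # Base URL that an OpenAI-style client would be pointed at.
    echo "http://${API_HOST}:${API_PORT}/v1"
}

start_fastchat_api() {
    # 1. Controller that coordinates the workers.
    python3 -m fastchat.serve.controller &
    sleep 10   # crude wait for the controller; adjust as needed
    # 2. Worker that loads the model weights.
    python3 -m fastchat.serve.model_worker --model-path "$MODEL_PATH" &
    sleep 60   # crude wait for the model to load; adjust as needed
    # 3. REST server exposing OpenAI-style endpoints (runs in the foreground).
    python3 -m fastchat.serve.openai_api_server --host "$API_HOST" --port "$API_PORT"
}

# start_fastchat_api   # uncomment to launch; requires FastChat to be installed
```

Once the server is up, an OpenAI-style client can use the address printed by api_base_url as its API base. How eval_chatgpt.py selects the API base is not shown here, so check the script before relying on this setup.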