Log prompts and chains¶
You can log your prompts and chains to Comet using the open-source comet-llm
Python SDK.
The LLM SDK also allows you to log a user feedback score using the LLM API.
Install and configure the LLM SDK¶
To install and configure the LLM SDK, run the following:

- Install the SDK:

```bash
pip install comet_llm
```

- Configure the SDK:

```python
import comet_llm

comet_llm.init()
```
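If you prefer to configure the SDK without the interactive prompt, you can also pass your credentials to `init()` directly. A minimal sketch, assuming the `api_key`, `workspace`, and `project` parameters; the values are placeholders:

```python
import comet_llm

# Configure the SDK explicitly; replace the placeholder values
# with your own Comet credentials and project name.
comet_llm.init(
    api_key="YOUR-COMET-API-KEY",
    workspace="YOUR-WORKSPACE",
    project="YOUR-LLM-PROJECT",
)
```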
You can find a full guide on how to configure the SDK here.
Note
If you are using OpenAI or LangChain, you don't need to log each prompt or chain manually as we have dedicated integrations with these frameworks. You can read more about the OpenAI integration here and the LangChain integration here.
Log prompts¶
The LLM SDK supports logging a prompt with its associated response, as well as any associated metadata such as token usage. This can be achieved through the `log_prompt` function:
```python
import comet_llm

comet_llm.log_prompt(
    prompt="Answer the question and if the question can't be answered, say \"I don't know\"\n\n---\n\nQuestion: What is your name?\nAnswer:",
    prompt_template="Answer the question and if the question can't be answered, say \"I don't know\"\n\n---\n\nQuestion: {{question}}?\nAnswer:",
    prompt_template_variables={"question": "What is your name?"},
    metadata={
        "usage.prompt_tokens": 7,
        "usage.completion_tokens": 5,
        "usage.total_tokens": 12,
    },
    output=" My name is Alex.",
    duration=16.598,
)
```
Log chains¶
The LLM SDK supports logging a chain of executions that may include more than one LLM call, context retrieval, or data pre- or post-processing.
First, start a chain with its inputs:

```python
import comet_llm

comet_llm.start_chain({"user_question": user_question})
```
For each step in the chain, you can create a Span object. The Span object keeps track of the inputs, outputs, and duration of the step. You can have as many Spans as needed, and they can be nested within each other. Here is a very simple example:
```python
with comet_llm.Span(
    category="YOUR-SPAN-CATEGORY",  # You can use any string here
    inputs=INPUTS,  # A dict; any value is accepted as long as it can be serialized to JSON
) as span:
    YOUR_CODE_HERE
    span.set_outputs(outputs=OUTPUTS)  # A dict; any value is accepted as long as it can be serialized to JSON
```
Here is a more realistic example with nested Spans:
```python
def retrieve_context(user_question):
    # Retrieve the context
    with comet_llm.Span(
        category="context-retrieval",
        name="Retrieve Context",
        inputs={"user_question": user_question},
    ) as context_span:
        context = get_context(user_question)

        context_span.set_outputs(outputs={"context": context})

    return context


def llm_call(user_question, context):
    prompt_template = """You are a helpful chatbot. You have access to the following context:

{context}

Analyze the following user question and decide if you can answer it, if the question can't be answered, say \"I don't know\":

{user_question}
"""
    prompt = prompt_template.format(user_question=user_question, context=context)

    with comet_llm.Span(
        category="llm-call",
        inputs={"prompt_template": prompt_template, "prompt": prompt},
    ) as llm_span:
        # Call your LLM model here
        result = "Yes we are currently open"
        usage = {"prompt_tokens": 52, "completion_tokens": 12, "total_tokens": 64}

        llm_span.set_outputs(outputs={"result": result}, metadata={"usage": usage})

    return result


with comet_llm.Span(
    category="llm-reasoning",
    inputs={"user_question": user_question},
) as span:
    context = retrieve_context(user_question)
    result = llm_call(user_question, context)

    span.set_outputs(outputs={"result": result})
```
Finally, end your chain and upload it with:

```python
comet_llm.end_chain(outputs={"result": result})
```
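Putting these pieces together, a minimal end-to-end chain looks like this (the model call is replaced by a hard-coded response for illustration):

```python
import comet_llm

comet_llm.init()

user_question = "What are your opening hours?"

# Start the chain with its inputs.
comet_llm.start_chain({"user_question": user_question})

with comet_llm.Span(
    category="llm-call",
    inputs={"user_question": user_question},
) as span:
    # Stand-in for a real model call.
    result = "We are open from 9am to 5pm, Monday to Friday."
    span.set_outputs(outputs={"result": result})

# End the chain and upload it to Comet.
comet_llm.end_chain(outputs={"result": result})
```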
Log user feedback score¶
There are two ways to log a user feedback score to Comet:

- Using the `log_user_feedback` method
- Logging your own metadata attribute

The benefit of using the `log_user_feedback` method is that Comet will display the average of this score when you group prompts. In addition, you can update this score in the Prompt Table using the thumbs up / thumbs down feature.

However, since Comet only supports `0` and `1` as valid feedback scores, in some scenarios you might want to log the feedback score as a metadata attribute instead.
Log user feedback score with log_user_feedback¶
To use the `log_user_feedback` method, you first need to retrieve the prompt and then log the score.
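A minimal sketch of this flow, assuming the SDK's `API` client and its `get_llm_trace_by_name` helper (the project and trace names below are placeholders):

```python
import comet_llm

# Create an API client; this assumes the SDK has already been
# configured with comet_llm.init().
api = comet_llm.API()

# Retrieve the prompt (trace) to update.
trace = api.get_llm_trace_by_name(
    project_name="YOUR-LLM-PROJECT",
    trace_name="YOUR-TRACE-NAME",
)

# Log the user feedback score; only 0 and 1 are valid values.
trace.log_user_feedback(1)
```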
Log user feedback score as metadata¶
You can log a user feedback score as metadata either when you log the prompt or at a later date. In the example below, the score is logged after the prompt has been logged.
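A minimal sketch, assuming the trace is retrieved with `get_llm_trace_by_name` and exposes a `log_metadata` method (the project and trace names and the metadata key below are placeholders):

```python
import comet_llm

# Log the prompt first.
comet_llm.log_prompt(
    prompt="What is your name?",
    output="My name is Alex.",
)

# Later, retrieve the trace and attach the score as a metadata attribute.
api = comet_llm.API()
trace = api.get_llm_trace_by_name(
    project_name="YOUR-LLM-PROJECT",
    trace_name="YOUR-TRACE-NAME",
)

# Unlike log_user_feedback, a metadata attribute can hold any value,
# not just 0 or 1.
trace.log_metadata({"user_feedback_score": 0.75})
```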