Helicone Tutorial

Helicone is an open-source observability platform that proxies your OpenAI traffic and gives you key insights into your spend, latency, and usage.

Use Helicone to log requests across all LLM providers (OpenAI, Azure, Anthropic, Cohere, Replicate, PaLM).

liteLLM provides success_callback and failure_callback settings, making it easy to send data to a particular provider depending on whether a request succeeds or fails.

In this case, we want to log requests to Helicone when a request succeeds.
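To illustrate the idea of status-based callbacks, here is a minimal sketch in plain Python. It is not liteLLM's implementation: the function `run_with_callbacks` and the event dictionaries are hypothetical names chosen for this example, and they only model the pattern of notifying one list of loggers on success and another on failure.

```python
from typing import Any, Callable, List

# A logger is any callable that accepts one event dictionary.
Logger = Callable[[dict], None]


def run_with_callbacks(
    call: Callable[[], Any],
    on_success: List[Logger],
    on_failure: List[Logger],
) -> Any:
    """Run call() and notify the list of loggers that matches the outcome."""
    try:
        result = call()
    except Exception as exc:
        # The request failed: notify failure loggers, then re-raise.
        for log in on_failure:
            log({"status": "failure", "error": str(exc)})
        raise
    # The request succeeded: notify success loggers.
    for log in on_success:
        log({"status": "success", "result": result})
    return result


# Usage: collect events in a list instead of sending them to a provider.
events: List[dict] = []
run_with_callbacks(lambda: "hello", on_success=[events.append], on_failure=[])
print(events)
```

In liteLLM itself, you only name the integration (for example "helicone") in the callback list, and the library handles the dispatch.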

Approach 1: Use Callbacks

With a single line of code, you can log your responses across all providers to Helicone:

litellm.success_callback=["helicone"]

Complete code

import os

import litellm
from litellm import completion

# set env variables
os.environ["HELICONE_API_KEY"] = "your-helicone-key"
os.environ["OPENAI_API_KEY"] = ""  # add your OpenAI key
os.environ["COHERE_API_KEY"] = ""  # add your Cohere key

# log successful requests to Helicone
litellm.success_callback = ["helicone"]

# OpenAI call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])

# Cohere call
response = completion(model="command-nightly", messages=[{"role": "user", "content": "Hi 👋 - i'm cohere"}])
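The example above leaves the provider keys as empty strings, so it can help to check that every key is actually set before making calls. The following is a small sketch using only the standard library; `missing_env_vars` is a hypothetical helper, not part of liteLLM or Helicone.

```python
import os
from typing import List, Mapping, Optional


def missing_env_vars(
    names: List[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names that are unset or empty in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Keys used by the callback example above.
required = ["HELICONE_API_KEY", "OPENAI_API_KEY", "COHERE_API_KEY"]
missing = missing_env_vars(required)
if missing:
    print("Missing or empty:", ", ".join(missing))
```

Running this check before the completion calls surfaces a missing key early.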

Approach 2: [OpenAI + Azure only] Use Helicone as a proxy

Helicone's proxy provides advanced functionality such as caching. This is currently supported for Azure and OpenAI only.

To use Helicone as a proxy for your OpenAI/Azure requests:

  • Set Helicone as your base URL via: litellm.api_base
  • Pass in Helicone request headers via: litellm.headers

Complete code

import os

import litellm

# route requests through the Helicone proxy
litellm.api_base = "https://oai.hconeai.com/v1"
# authenticate with Helicone
litellm.headers = {"Helicone-Auth": f"Bearer {os.getenv('HELICONE_API_KEY')}"}

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "how does a court case get to the Supreme Court?"}],
)

print(response)
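The proxy example builds the Helicone-Auth header inline, which silently produces "Bearer None" if HELICONE_API_KEY is unset. A small helper can fail fast instead. This is a sketch under stated assumptions: `helicone_headers` is a hypothetical function, and the optional Helicone-Cache-Enabled header is assumed from Helicone's caching feature, so verify the header name against Helicone's documentation before relying on it.

```python
import os
from typing import Dict, Optional


def helicone_headers(api_key: Optional[str] = None, cache: bool = False) -> Dict[str, str]:
    """Build Helicone request headers, raising if no API key is available."""
    key = api_key or os.getenv("HELICONE_API_KEY")
    if not key:
        raise ValueError("HELICONE_API_KEY is not set")
    headers = {"Helicone-Auth": f"Bearer {key}"}
    if cache:
        # Assumption: header name for Helicone caching; check Helicone's docs.
        headers["Helicone-Cache-Enabled"] = "true"
    return headers


# Usage with a placeholder key.
print(helicone_headers("your-helicone-key"))
```

The result can then be assigned in place of the inline dictionary, for example `litellm.headers = helicone_headers()`.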