The OpenAI API in Python
Learn AI online at www.DataCamp.com

> Setup

To get started, you need to:
- Create an OpenAI Developer account
- Add a payment method to your OpenAI developer account
- Retrieve your secret key and store it as an environment variable

We recommend using a platform like DataCamp Workspace that allows secure storage of your API secret key.

You'll need to load the os package to access your secret key, the openai package to access the API, pandas to make some JSON output easier to work with, and some functions from IPython.display to render markdown output.

# Import the necessary packages
import os
import openai
import pandas as pd
from IPython.display import display, Markdown

# Set openai.api_key to the OPENAI environment variable
openai.api_key = os.environ["OPENAI"]

# List available models
pd.json_normalize(openai.Model.list(), "data")
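If the environment variable is missing, os.environ raises a bare KeyError. A minimal sketch of reading the key with a clearer failure message (assuming the variable is named OPENAI, as elsewhere in this sheet; the load_api_key name is illustrative, not part of the openai package):

```python
import os

def load_api_key(var="OPENAI"):
    """Read the secret key from the environment, failing loudly if unset."""
    key = os.environ.get(var)
    if key is None:
        raise RuntimeError(f"Set the {var} environment variable before calling the API.")
    return key
```

You would then assign the result with openai.api_key = load_api_key().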
> Generate Text with GPT

Basic flow for Chat

The GPT model supports chat functionality where you can insert a prompt and it responds with a message. Supported models for chat are:
- "gpt-4": GPT-4 (recommended for high-performance use)
- "gpt-4-0314": GPT-4, snapshotted on 2023-03-14
- "gpt-4-32k": GPT-4 with 32k context (recommended for high performance, long chats)
- "gpt-4-32k-0314": GPT-4 32k, snapshotted on 2023-03-14
- "gpt-3.5-turbo": GPT-3.5 (recommended for cost-effective use)
- "gpt-3.5-turbo-0301": GPT-3.5, snapshotted on 2023-03-01

There are three types of messages:
- system: Specifies how the AI assistant should behave.
- user: Specifies what you want the AI assistant to say.
- assistant: Contains previous output from the AI assistant or specifies examples of desired AI output.

# Converse with GPT with openai.ChatCompletion.create()
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": "You are a stand-up comic performing to an audience of data scientists. Your specialist genre is dad jokes."
        },
        {
            "role": "user",
            "content": "Tell a joke about statistics."
        },
        {
            "role": "assistant",
            "content": "My last gig was at a statistics conference. I told 100 jokes to try and make people laugh. No pun in ten did."
        }
    ]
)

# Check the response status
response["choices"][0]["finish_reason"]

# Extract the AI output content
ai_output = response["choices"][0]["message"]["content"]

# Render the AI output content
display(Markdown(ai_output))

Tune Chat Output

Alter the randomness and novelty of the output text by tuning it.

# Control randomness with temperature (default is 1)
# temperature=0 gives highly deterministic output
# temperature=2 gives highly random output
response = openai.ChatCompletion.create(mdl, mssgs, temperature=0.5)

# Control randomness using nucleus sampling with top_p (default is 1)
# top_p=0 gives highly deterministic output
# top_p=1 gives highly random output
response = openai.ChatCompletion.create(mdl, mssgs, top_p=0.5)

# Control talking about new topics using presence_penalty (default is 0)
# presence_penalty=-2 gives more repetition in conversations
# presence_penalty=2 gives more novelty in conversations
# frequency_penalty behaves similarly, but counts the number of instances
# of previous tokens rather than detecting their presence
response = openai.ChatCompletion.create(mdl, mssgs, presence_penalty=1)

# Limit output length with max_tokens
response = openai.ChatCompletion.create(mdl, mssgs, max_tokens=500)
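Chat requests take a list of role-tagged message dicts, which a small helper can assemble. A sketch under that assumption (the build_messages name is hypothetical, not part of the openai package):

```python
def build_messages(system, user, examples=()):
    """Assemble a chat message list: one system message, optional
    assistant examples of desired output, then the user prompt."""
    messages = [{"role": "system", "content": system}]
    for example in examples:
        messages.append({"role": "assistant", "content": example})
    messages.append({"role": "user", "content": user})
    return messages
```

The result can be passed directly as the messages argument to openai.ChatCompletion.create().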
> Convert Speech to Text with Whisper

Audio files can be converted to text. Supported file formats are mp3, mp4, mpeg, mpga, m4a, wav, and webm. The output can be given in the original language or in English.

Supported models:
- whisper-1: Whisper (recommended)

Basic flow for transcription

# Transcribe the file with openai.Audio.transcribe()
# Note that model is the second arg here, not the first
with open("audio.mp3", "rb") as audio_file:
    transcript = openai.Audio.transcribe(
        file=audio_file,
        model="whisper-1",
        response_format="text",
        language="en"
    )

Improve transcription performance

# Include a partial script in a prompt to improve quality
transcript = openai.Audio.transcribe(..., prompt="Welcome to DataFramed!")

Create Alternate Output Formats

# Create SubRip subtitles with response_format="srt"
transcript = openai.Audio.transcribe(..., response_format="srt")

# Create Video Text Track subtitles with response_format="vtt"
transcript = openai.Audio.transcribe(..., response_format="vtt")

# Get metadata with response_format="verbose_json"
response = openai.Audio.transcribe(..., response_format="verbose_json")
transcript = pd.json_normalize(response)

Translate Audio to English

# Transcribe the file & translate to English with openai.Audio.translate()
with open("audio.mp3", "rb") as audio_file:
    transcript = openai.Audio.translate(
        file=audio_file,
        model="whisper-1",
        response_format="text"
    )
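SubRip (srt) output is plain text, so it can be post-processed with the standard library. An illustrative sketch of splitting it into cues (the parse_srt helper is my own, not part of the openai package):

```python
import re

def parse_srt(srt_text):
    """Split SubRip text into (index, timing, caption) tuples.
    Cues are blank-line separated: an index line, a timing line,
    then one or more caption lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) >= 3:
            cues.append((int(lines[0]), lines[1], " ".join(lines[2:])))
    return cues
```

For example, parse_srt(transcript) would return one tuple per subtitle cue.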
> Find Similar Text with Embeddings

GPT models can be used for converting text to a numeric array that represents its meaning (embedding it), in order to find similar text.

Basic Flow for Embeddings

# Embed a line of text
response = openai.Embedding.create(
    model="text-embedding-ada-002",
    input=["YOUR TEXT TO EMBED"]
)

# Extract the AI output embedding as a list of floats
embedding = response["data"][0]["embedding"]

Example Workflow

Embeddings are typically applied row-wise to text in a DataFrame. Consider this DataFrame, pizza, of pizza reviews (only 5 reviews shown; usually you want a bigger dataset).

Review
- The best pizza I've ever eaten. The sauce was so tangy!
- The pizza was disgusting. I think the pepperoni was made from rats.
- I ordered a hot-dog and was given a pizza, but I ate it anyway.
- I hate pineapple on pizza. It is a disgrace. Somehow, it worked well on this pizza though.
- I ate 11 slices and threw up. The pizza was tasty in both directions.

# Helper function to get embeddings
def get_embedding(txt):
    txt = txt.replace("\n", " ")
    response = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=[txt]
    )
    return response["data"][0]["embedding"]

# Get embedding for each row of a text column of a DataFrame
pizza["embedding"] = pizza["review"].apply(get_embedding)
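Finding similar text then comes down to comparing embedding vectors, typically with cosine similarity. A self-contained sketch using only the standard library (the openai package also ships a comparable helper in openai.embeddings_utils, though this sheet does not rely on it):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors:
    1.0 for identical directions, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Ranking rows by cosine_similarity(query_embedding, row_embedding) surfaces the most semantically similar reviews.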
> Create Images with DALL-E

DALL-E can be used to generate images from text.

Basic Flow for Image Generation

# Utilities for PNG image display
from PIL import Image
from io import BytesIO
from requests import get

# Generate images with openai.Image.create()
response = openai.Image.create(
    prompt="Oil painting of data scientist rejoicing after mastering a new AI skill."
)

# Retrieve the image from a URL & display
img_bytes = get(response["data"][0]["url"]).content
img = Image.open(BytesIO(img_bytes))
display(img)

Get the Image Directly

# Return generated image directly with response_format="b64_json"
response = openai.Image.create(
    prompt="Digital illustration of data scientist and a robot high-fiving.",
    response_format="b64_json"
)

# Decompress image & display
from base64 import b64decode
img_bytes = b64decode(response["data"][0]["b64_json"])
img = Image.open(BytesIO(img_bytes))
display(img)

Control Output Quantity

# Return multiple images with the n argument
response = openai.Image.create(
    prompt="A data scientist winning a medal in the data Olympics.",
    n=3
)

# Access the ith image URL or compressed bytes
response["data"][i]["url"]
response["data"][i]["b64_json"]

# Reduce the image size with the size argument
# Choices are 256x256, 512x512, 1024x1024 (default)
response = openai.Image.create(
    prompt="A data scientist saving the world from alien attack.",
    size="256x256"
)
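A b64_json payload can also be written straight to disk rather than displayed. A minimal sketch (save_b64_png is an illustrative name, not an API function):

```python
from base64 import b64decode

def save_b64_png(b64_data, path):
    """Decode a base64-encoded image payload and write it to a file."""
    with open(path, "wb") as f:
        f.write(b64decode(b64_data))
    return path
```

For example, save_b64_png(response["data"][0]["b64_json"], "scientist.png") keeps a local copy without a second network request.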