Last Updated on 2024-09-07 by Clay
Recently, I integrated several Outlines features into my current workflow. The one I use most frequently is Outlines with vLLM. However, for some reason, its documentation was never merged into the vLLM GitHub repository, so while designing my pipeline I had to keep referring to the source code of a rejected PR for guidance XD
In light of this, I decided to compile a set of notes for my own reference, which I am documenting here.
If readers would like to learn more about Outlines and Finite-State Machines (FSM), they might want to check out my previous notes:
- Implementation of Using Finite-State Machine to Constrain Large Language Model Decoding
- Structuring Model Outputs Using the Outlines Tool
How to Use Outlines in vLLM
Specific JSON Format
We can define a Pydantic model and pass its JSON schema in the guided_json field of the request to specify the JSON structure that the LLM must generate.
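import json

import requests
from pydantic import BaseModel, Field

# Assumed values, adjust them to your own deployment
vllm_url = "http://localhost:8000/v1/chat/completions"  # vLLM OpenAI-compatible chat endpoint
model = "your-model-name"  # placeholder: the model name served by vLLM


class Answer(BaseModel):
    is_human: bool
    age: int = Field(..., ge=0, le=5)


# Convert the Pydantic model into a JSON schema for guided decoding
metric_schema = Answer.model_json_schema()

input_data = {
    "model": model,
    "guided_json": metric_schema,
    "messages": [
        {
            "role": "user",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "assistant",
            "content": "Nice to meet you!",
        },
        {
            "role": "user",
            "content": "How old are you? Are you a human?",
        },
    ],
}

response = requests.post(url=vllm_url, json=input_data)

# The message content is a JSON string that follows the schema
print(json.loads(response.json()["choices"][0]["message"]["content"]))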
Output:
{'is_human': False, 'age': 0}
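The returned content can also be checked against the same Pydantic model instead of only being parsed with json.loads. The following is a minimal sketch, not part of the original workflow: parse_answer is a hypothetical helper, and the three strings are stand-ins for response.json()["choices"][0]["message"]["content"]. It assumes Pydantic v2, as the model_json_schema() call above does.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Answer(BaseModel):
    is_human: bool
    age: int = Field(..., ge=0, le=5)


def parse_answer(content: str) -> Optional[Answer]:
    """Return a validated Answer, or None if the text does not fit the schema.

    Hypothetical helper for illustration only.
    """
    try:
        return Answer.model_validate_json(content)
    except ValidationError:
        return None


# Stand-in strings for the message content returned by the server
good = parse_answer('{"is_human": false, "age": 0}')
truncated = parse_answer('{"is_human": false, "age"')      # incomplete JSON
out_of_range = parse_answer('{"is_human": false, "age": 9}')  # violates le=5

print(good)
print(truncated, out_of_range)
```

A check like this is cheap and catches cases such as a response that was cut off before the JSON object was closed.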
Regular Expression
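import requests

# Assumed values, adjust them to your own deployment
vllm_url = "http://localhost:8000/v1/chat/completions"  # vLLM OpenAI-compatible chat endpoint
model = "your-model-name"  # placeholder: the model name served by vLLM

# Four groups of 1-3 digits separated by dots (IPv4-like)
regex_pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

input_data = {
    "model": model,
    "guided_regex": regex_pattern,
    "messages": [
        {
            "role": "user",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "assistant",
            "content": "Nice to meet you!",
        },
        {
            "role": "user",
            "content": "What is the IP address of the Google DNS servers?",
        },
    ],
}

response = requests.post(url=vllm_url, json=input_data)
print(response.json()["choices"][0]["message"]["content"])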
Output:
8.8.8.8
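Note that the pattern above only constrains the shape of the output: it also accepts strings such as 999.999.999.999. Below is a small offline sketch comparing it with a stricter pattern that limits each octet to 0-255. The stricter pattern and the matches helper are my own additions for illustration; I have only checked the pattern with Python's re module here, not as a guided_regex value in vLLM.

```python
import re

# Pattern from the example above: any 1-3 digits per octet
LOOSE_IPV4 = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

# Stricter variant (an assumption of this sketch): each octet limited to 0-255
OCTET = r"(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
STRICT_IPV4 = rf"{OCTET}\.{OCTET}\.{OCTET}\.{OCTET}"


def matches(pattern: str, text: str) -> bool:
    """True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


for candidate in ["8.8.8.8", "999.999.999.999", "8.8.8"]:
    print(candidate, matches(LOOSE_IPV4, candidate), matches(STRICT_IPV4, candidate))
```

The same re.fullmatch check can be applied to the server's response as a sanity test after the request.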
Multiple Choice
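import requests

# Assumed values, adjust them to your own deployment
vllm_url = "http://localhost:8000/v1/chat/completions"  # vLLM OpenAI-compatible chat endpoint
model = "your-model-name"  # placeholder: the model name served by vLLM

# The model must answer with exactly one of these strings
choices = ["Positive", "Negative"]

input_data = {
    "model": model,
    "guided_choice": choices,
    "messages": [
        {
            "role": "user",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "assistant",
            "content": "Nice to meet you!",
        },
        {
            "role": "user",
            "content": "How do you feel which emotion that I have: I'm glad to help you!",
        },
    ],
}

response = requests.post(url=vllm_url, json=input_data)
print(response.json()["choices"][0]["message"]["content"])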
Output:
Positive
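The three requests above differ only in which guided_* field they set, so the payload construction can be factored into one function. This is a rough sketch under my own assumptions: build_guided_payload is a hypothetical helper, and the rule that exactly one constraint must be given is a choice made in this sketch to mirror the examples, each of which uses a single constraint.

```python
from typing import Any, Dict, List, Optional


def build_guided_payload(
    model: str,
    messages: List[Dict[str, str]],
    *,
    guided_json: Optional[Dict[str, Any]] = None,
    guided_regex: Optional[str] = None,
    guided_choice: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a chat request body with exactly one guided decoding field."""
    constraints = {
        "guided_json": guided_json,
        "guided_regex": guided_regex,
        "guided_choice": guided_choice,
    }
    # Keep only the constraints that were actually provided
    chosen = {key: value for key, value in constraints.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError("Provide exactly one of guided_json, guided_regex, guided_choice")
    return {"model": model, "messages": messages, **chosen}


messages = [{"role": "user", "content": "I'm glad to help you!"}]
payload = build_guided_payload(
    "your-model-name",  # placeholder model name
    messages,
    guided_choice=["Positive", "Negative"],
)
print(sorted(payload))

# Passing two constraints at once is rejected by this sketch
try:
    build_guided_payload("your-model-name", messages, guided_regex=r"\d+", guided_choice=["a"])
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The resulting dictionary can be sent the same way as before, with requests.post(url=vllm_url, json=payload).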
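From the above examples, we can see that vLLM supports Outlines-style constrained decoding through three request fields: guided_json for JSON schemas, guided_regex for regular expressions, and guided_choice for a fixed list of options. I’m documenting this for future reference.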
References
- https://github.com/vllm-project/vllm/pull/6045/files
- https://github.com/vllm-project/vllm/tree/main/examples