Skip to content
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
1d17465
Add refactored OpenAI API server modules implementation
JustinTong0323 Jun 14, 2025
42bb560
Merge branch 'main' into refactor_oai_server_serving
JustinTong0323 Jun 14, 2025
d9ceddd
feat: add serving_embedding
JustinTong0323 Jun 14, 2025
f8d604b
Refactors request handling in OpenAI endpoints
JustinTong0323 Jun 14, 2025
a86bf27
Adds documentation to OpenAI API endpoints
JustinTong0323 Jun 14, 2025
5ddc8fc
Simplifies getting enable_thinking value
JustinTong0323 Jun 14, 2025
2ddbb40
rename serving_engine to serving_base
JustinTong0323 Jun 14, 2025
26771ad
Merge branch 'main' into refactor_oai_server_serving
JustinTong0323 Jun 14, 2025
4596b52
Makes chat template caching instance-specific
JustinTong0323 Jun 14, 2025
47d54dc
Refactors logprobs processing
JustinTong0323 Jun 14, 2025
8ac4349
Update python/sglang/srt/entrypoints/openai/protocol.py
JustinTong0323 Jun 14, 2025
00b202c
Improve test cases for eagle infer (#7173)
merrymercy Jun 14, 2025
fb4ae05
fix CI
JustinTong0323 Jun 14, 2025
81f5e41
Merge branch 'main' into refactor_oai_server_serving
JustinTong0323 Jun 14, 2025
2a10db7
Merge branch 'main' into refactor_oai_server_serving
JustinTong0323 Jun 14, 2025
3b28fdb
Removes unused utility functions
JustinTong0323 Jun 14, 2025
012bcb5
Refactors request validation for OpenAI endpoints
JustinTong0323 Jun 15, 2025
27341ae
Improves OpenAI serving base class logic
JustinTong0323 Jun 15, 2025
286751a
Refactors error handling for OpenAI endpoints
JustinTong0323 Jun 15, 2025
50d57d1
Refactors request ID generation
JustinTong0323 Jun 15, 2025
960f917
Removes RequestContext
JustinTong0323 Jun 15, 2025
30663a5
Simplifies enable_thinking handling and remove unused functions
JustinTong0323 Jun 15, 2025
eb6784d
Refactors sampling parameter building
JustinTong0323 Jun 15, 2025
47da102
Renames OpenAI serving handler classes
JustinTong0323 Jun 15, 2025
177efdc
Merge branch 'main' into refactor_oai_server_serving
JustinTong0323 Jun 15, 2025
c5a60e0
cleanup docs and imports
JustinTong0323 Jun 15, 2025
d433e43
Fixes usage calculation in streaming mode
JustinTong0323 Jun 15, 2025
ba42ea1
Refactors error response handling in OpenAIServingBase
JustinTong0323 Jun 16, 2025
48586bf
Apply suggestions from code review
JustinTong0323 Jun 16, 2025
3e03b74
Refactors test fixtures for clarity and remove some tests
JustinTong0323 Jun 16, 2025
ac908e1
Enables tool call constraint in sampling params
JustinTong0323 Jun 16, 2025
69e41f7
move the `text = content["text"]` in serving_chat for Better readability
JustinTong0323 Jun 16, 2025
590db9a
lint
JustinTong0323 Jun 16, 2025
4c140c8
remove redundant logic
JustinTong0323 Jun 16, 2025
7190e6f
logic for generate_completion_prompt
JustinTong0323 Jun 16, 2025
40e97fc
Add comments back
JustinTong0323 Jun 16, 2025
84f6037
Merge branch 'main' into refactor_oai_server_serving
JustinTong0323 Jun 16, 2025
b95a288
fix tests
JustinTong0323 Jun 16, 2025
cc28f37
fix lint
JustinTong0323 Jun 16, 2025
ea30a8c
Merge branch 'main' into refactor_oai_server_serving
zhyncs Jun 17, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 48 additions & 0 deletions python/sglang/srt/entrypoints/openai/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
# Copyright 2023-2024 SGLang Team
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""
OpenAI-compatible API server module for SGLang.

This module provides OpenAI-compatible API endpoints that allow existing OpenAI client
applications to seamlessly work with SGLang models. The implementation includes:

Key Features:
- Full OpenAI API compatibility for chat completions, text completions, and embeddings
- Streaming support for real-time response generation
- Batch processing capabilities for multiple requests
- Function calling and tool use support
- Multimodal input support (text, images, audio)
- Advanced reasoning capabilities with separate reasoning content
- Custom sampling parameters and constraints (regex, JSON schema, EBNF)
- LoRA adapter support for fine-tuned models
- Cache reporting and token usage tracking

Supported Endpoints:
- /v1/chat/completions - Chat-based completions with conversation history
- /v1/completions - Text completions for single prompts
- /v1/embeddings - Text/multimodal embeddings generation
- /v1/models - Model listing and information

The module is structured with separate handlers for each endpoint type, all inheriting
from a common base class that provides shared functionality like request validation,
error handling, and response formatting.

Architecture:
- OpenAIServingBase: Abstract base class for all endpoint handlers
- ChatCompletionHandler: Handles chat completion requests
- CompletionHandler: Handles text completion requests
- EmbeddingHandler: Handles embedding requests
- Protocol classes: Pydantic models for request/response validation
- Utility functions: Shared helpers for formatting and validation
"""
Loading
Loading