# Cherry Studio Client Optimizations

This document describes the gateway optimizations that target the Cherry Studio client.

## Optimizations

### 1. Client name in logs

**Description**:

- Extracts the client name from the `x-title` request header
- Writes the client name to the log to simplify tracing and debugging
- Works for any client that sets the `x-title` header, not only Cherry Studio

**Log format**:

```
2025-12-10 15:09:17 - api.routers.chat - INFO - Chat completion request for model: google.gemini-2.5-pro, client: Cherry Studio
```

**Implementation**:

- [src/api/routers/chat.py](../src/api/routers/chat.py#L295-L296)

**Example**:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-oci-genai-default-key" \
  -H "x-title: Cherry Studio" \
  -d '{
    "model": "google.gemini-2.5-pro",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

### 2. Automatic mapping from thinking_budget to reasoning_effort

**Description**:

- Cherry Studio uses Google Gemini's `thinking_budget` parameter to control reasoning depth
- The gateway automatically maps `thinking_budget` to the OCI SDK's `reasoning_effort` parameter
- Supported for models from the meta, xai, google, and openai providers (Cohere is not supported)
- Transparent to other clients; standard OpenAI API compatibility is unaffected

**Mapping rules**:

| thinking_budget value | reasoning_effort | Description |
|-----------------------|------------------|-------------|
| ≤ 1760 | `low` | Fast responses, less reasoning |
| 1760 < X ≤ 16448 | `medium` | Balances speed and reasoning depth |
| > 16448 | `high` | Deep reasoning, more complete answers |
| -1 | None | Use the model default (checked before the other rules) |

**extra_body structure**:

Cherry Studio passes Google Gemini-specific configuration through `extra_body`:

```json
{
  "model": "google.gemini-2.5-pro",
  "messages": [...],
  "extra_body": {
    "google": {
      "thinking_config": {
        "thinking_budget": 1760,
        "include_thoughts": true
      }
    }
  }
}
```

**Implementation**:

- Mapping functions: [src/api/routers/chat.py](../src/api/routers/chat.py#L37-L102)
  - `map_thinking_budget_to_reasoning_effort()`: maps a numeric thinking_budget to a reasoning_effort value
  - `extract_reasoning_effort_from_extra_body()`: extracts thinking_budget from extra_body and applies the mapping
- OCI client: [src/core/oci_client.py](../src/core/oci_client.py#L333-L336)

**Log output**:

```
2025-12-10 15:09:17 - api.routers.chat - INFO - Chat completion request for model: google.gemini-2.5-pro, client: Cherry Studio
2025-12-10 15:09:17 - api.routers.chat - INFO - Cherry Studio thinking_budget 1760 mapped to reasoning_effort: low
2025-12-10 15:09:17 - core.oci_client - INFO - Setting reasoning_effort to LOW for google model
```

## Cherry Studio Usage Examples

### Basic chat

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-oci-genai-default-key" \
  -H "x-title: Cherry Studio" \
  -d '{
    "model": "google.gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```

### Using thinking_budget (low reasoning depth)

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-oci-genai-default-key" \
  -H "x-title: Cherry Studio" \
  -d '{
    "model": "google.gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "What is 2+2?"}
    ],
    "extra_body": {
      "google": {
        "thinking_config": {
          "thinking_budget": 1000
        }
      }
    }
  }'
```

### Using thinking_budget (medium reasoning depth)

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-oci-genai-default-key" \
  -H "x-title: Cherry Studio" \
  -d '{
    "model": "google.gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "Explain quantum entanglement"}
    ],
    "extra_body": {
      "google": {
        "thinking_config": {
          "thinking_budget": 5000
        }
      }
    }
  }'
```

### Using thinking_budget (high reasoning depth)

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-oci-genai-default-key" \
  -H "x-title: Cherry Studio" \
  -d '{
    "model": "google.gemini-2.5-pro",
    "messages": [
      {"role": "user", "content": "Solve this complex math problem: ..."}
    ],
    "extra_body": {
      "google": {
        "thinking_config": {
          "thinking_budget": 20000
        }
      }
    }
  }'
```

## Verifying the Logs

Start the service and watch the log to verify the Cherry Studio optimizations:

```bash
# Start the service (development mode)
cd src
python main.py

# Watch the log (in another terminal)
tail -f logs/app.log | grep -E "(client:|thinking_budget|reasoning_effort)"
```

Expected log output:

```
2025-12-10 15:09:17 - api.routers.chat - INFO - Chat completion request for model: google.gemini-2.5-pro, client: Cherry Studio
2025-12-10 15:09:17 - api.routers.chat - INFO - Cherry Studio thinking_budget 1760 mapped to reasoning_effort: low
2025-12-10 15:09:17 - core.oci_client - INFO - Setting reasoning_effort to LOW for google model
```

## Technical Implementation

### Schema change

An `extra_body` field was added in [src/api/schemas.py](../src/api/schemas.py):

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel  # assumption: BaseModel is pydantic's


class ChatCompletionRequest(BaseModel):
    # ... other fields ...
    extra_body: Optional[Dict[str, Any]] = None  # Cherry Studio and other client extensions
```

### Mapping functions

Two utility functions handle Cherry Studio's thinking_budget:

1. **map_thinking_budget_to_reasoning_effort**: maps a numeric thinking_budget to a reasoning_effort value
2. **extract_reasoning_effort_from_extra_body**: extracts thinking_budget from extra_body and applies the mapping

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def map_thinking_budget_to_reasoning_effort(thinking_budget: int) -> Optional[str]:
    """Map Cherry Studio's thinking_budget to OCI's reasoning_effort parameter."""
    if thinking_budget == -1:
        # -1 means "use the model default", so no effort level is set.
        return None
    elif thinking_budget <= 1760:
        return "low"
    elif thinking_budget <= 16448:
        return "medium"
    else:
        return "high"


def extract_reasoning_effort_from_extra_body(extra_body: Optional[dict]) -> Optional[str]:
    """Extract reasoning_effort from Cherry Studio's extra_body parameter."""
    if not extra_body:
        return None

    try:
        # Walk extra_body.google.thinking_config.thinking_budget, tolerating missing keys.
        google_config = extra_body.get("google", {})
        thinking_config = google_config.get("thinking_config", {})
        thinking_budget = thinking_config.get("thinking_budget")

        if thinking_budget is not None and isinstance(thinking_budget, (int, float)):
            effort = map_thinking_budget_to_reasoning_effort(int(thinking_budget))
            if effort:
                logger.info(f"Cherry Studio thinking_budget {thinking_budget} mapped to reasoning_effort: {effort}")
            return effort
    except (AttributeError, TypeError, KeyError) as e:
        # A malformed extra_body is not fatal; the request proceeds without reasoning_effort.
        logger.debug(f"Failed to extract thinking_budget from extra_body: {e}")

    return None
```

### OCI SDK integration

The `OCIGenAIClient.chat()` and `_build_generic_request()` methods were updated to pass the `reasoning_effort` parameter to the OCI SDK's `GenericChatRequest`.

## Compatibility

### Supported models

**reasoning_effort support** (via the thinking_budget mapping):

- ✅ Google Gemini models (google.gemini-2.5-pro, google.gemini-2.0-flash-exp)
- ✅ Meta Llama models (meta.llama-3.1-405b-instruct, meta.llama-3.2-90b-vision-instruct)
- ✅ xAI models
- ✅ OpenAI models
- ❌ Cohere models (the reasoning_effort parameter is not supported)

**Note**: reasoning_effort is optional. If the model does not support it, the parameter is ignored automatically and a warning is logged.

### Backward compatibility

- ✅ When `extra_body` is not provided, behavior is identical to before
- ✅ When `x-title` is not provided, the client name is shown as "Unknown"
- ✅ Other clients are unaffected and continue to work normally
- ✅ Standard OpenAI API compatibility is fully preserved

### Compatibility with other clients

Although this optimization was designed for Cherry Studio, the implementation ensures that:

1. **Other clients are unaffected**: clients that do not use `extra_body.google.thinking_config` see no change in behavior
2. **The standard API stays compatible**: all standard OpenAI API features continue to work

## Troubleshooting

### Issue 1: thinking_budget has no effect

**Symptom**: the log contains no "mapped to reasoning_effort" message

**Solution**:

1. Confirm that the `extra_body` structure is correct; the nested path is `extra_body.google.thinking_config.thinking_budget`
2. Confirm that the model is supported (meta, xai, google, openai; Cohere is not supported)
3. Check that the thinking_budget value is valid (a non-null number). Note that -1 selects the model default, so it produces no "mapped to reasoning_effort" message
4. Check the log for errors or warnings

**Verifying the extra_body structure**:

```jsonc
// Correct structure
{
  "extra_body": {
    "google": {                    // the key must be "google"
      "thinking_config": {         // the key must be "thinking_config"
        "thinking_budget": 5000    // the key must be "thinking_budget"; the value must be a number
      }
    }
  }
}
```

### Issue 2: client name is shown as "Unknown"

**Symptom**: the log shows the client as "Unknown" instead of "Cherry Studio"

**Solution**:

1. Confirm that the request includes the `x-title` header
2. Check that Cherry Studio is configured to send the custom request header
3. Add the header manually to test

**Test command**:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "x-title: Cherry Studio" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-oci-genai-default-key" \
  -d '{"model": "google.gemini-2.5-pro", "messages": [{"role": "user", "content": "test"}]}'
```

### Issue 3: thinking_budget maps to the wrong reasoning_effort

**Symptom**: the actual reasoning_effort differs from the expected one

**Check the mapping rules**:

- thinking_budget ≤ 1760 → low
- 1760 < thinking_budget ≤ 16448 → medium
- thinking_budget > 16448 → high
- thinking_budget = -1 → None (use the model default)

**Examples**:

```python
# thinking_budget = 1000  → low ✓
# thinking_budget = 5000  → medium ✓
# thinking_budget = 20000 → high ✓
# thinking_budget = -1    → None (default) ✓
```

## Testing

### Automated tests

Run the Cherry Studio optimization test script:

```bash
./tests/test_cherry_studio_optimization.sh
```

The script verifies the following scenarios:

1. thinking_budget = 1000 → reasoning_effort = low
2. thinking_budget = 5000 → reasoning_effort = medium
3. thinking_budget = 20000 → reasoning_effort = high
4. thinking_budget = -1 → model default is used
5. No extra_body (normal request)
6. Different client names (verifies `x-title` detection)

## References

- [OCI GenAI Python SDK - GenericChatRequest](https://docs.oracle.com/en-us/iaas/tools/python/latest/api/generative_ai_inference/models/oci.generative_ai_inference.models.GenericChatRequest.html)
- [OpenAI API - Reasoning Models](https://platform.openai.com/docs/guides/reasoning)
- [Google Gemini - Thinking](https://ai.google.dev/gemini-api/docs/thinking)
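
## Appendix: Boundary-Check Sketch

The mapping thresholds and the `extra_body` path described in this document can be exercised from a small client-side script. The following is a minimal sketch, not part of the gateway: `expected_reasoning_effort` and `build_chat_payload` are hypothetical helper names, and the thresholds (1760 and 16448) are copied from the mapping table in this document rather than imported from the gateway code. It may be useful for generating request bodies at the boundary values and predicting which `reasoning_effort` should appear in the log.

```python
from typing import Any, Dict, Optional

# Thresholds copied from this document's mapping table (assumed to match the gateway).
LOW_MAX = 1760
MEDIUM_MAX = 16448


def expected_reasoning_effort(thinking_budget: int) -> Optional[str]:
    """Return the effort level the documented mapping predicts for a budget."""
    if thinking_budget == -1:  # -1 means "use the model default"
        return None
    if thinking_budget <= LOW_MAX:
        return "low"
    if thinking_budget <= MEDIUM_MAX:
        return "medium"
    return "high"


def build_chat_payload(
    model: str, prompt: str, thinking_budget: Optional[int] = None
) -> Dict[str, Any]:
    """Build a chat request body (hypothetical helper, for test scripts only)."""
    payload: Dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking_budget is not None:
        # Nested path documented above: extra_body.google.thinking_config.thinking_budget
        payload["extra_body"] = {
            "google": {"thinking_config": {"thinking_budget": thinking_budget}}
        }
    return payload
```

For example, `build_chat_payload("google.gemini-2.5-pro", "Hello", 5000)` yields a body whose `thinking_budget` is 5000, for which `expected_reasoning_effort(5000)` predicts `medium`. Checking the values 1760, 1761, 16448, and 16449 covers both boundaries of the table.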