yym68686 committed
Commit eb02b52 · 1 Parent(s): 252357c

🤖 Models: Add support for the cohere series models

Files changed (6):
  1. README.md +3 -3
  2. README_CN.md +3 -3
  3. main.py +5 -1
  4. request.py +75 -5
  5. response.py +26 -0
  6. test/test_matplotlib.py +72 -1
README.md CHANGED
@@ -13,13 +13,13 @@

 ## Introduction

-If used personally, one/new-api is too complex and has many commercial functions that individuals do not need. If you do not want a complicated front-end interface and want to support more models, you can try uni-api. This is a project for unified management of large model APIs, allowing you to call multiple backend services through a unified API interface, converting them uniformly to OpenAI format and supporting load balancing. Currently supported backend services include: OpenAI, Anthropic, Gemini, Vertex, Cloudflare, DeepBricks, OpenRouter, etc.
+If used for personal purposes, one/new-api is too complex and has many commercial features that individuals do not need. If you do not want a complex front-end interface and want to support more models, you can try uni-api. This is a project that manages large model APIs uniformly, allowing you to call multiple backend services through a unified API interface, converting them uniformly to the OpenAI format and supporting load balancing. The currently supported backend services include: OpenAI, Anthropic, Gemini, Vertex, Cohere, Cloudflare, DeepBricks, OpenRouter, etc.

 ## Features

 - No frontend, pure configuration file setup for API channels. You can run your own API site by just writing one file, with detailed configuration guides in the documentation, beginner-friendly.
 - Unified management of multiple backend services, supporting providers like OpenAI, Deepseek, DeepBricks, OpenRouter, and other APIs in the OpenAI format. Supports OpenAI Dalle-3 image generation.
-- Supports Anthropic, Gemini, Vertex API, and Cloudflare. Vertex supports both Claude and Gemini API.
+- Supports Anthropic, Gemini, Vertex AI, Cohere, and Cloudflare. Vertex supports both the Claude and Gemini APIs.
 - Supports OpenAI, Anthropic, Gemini, Vertex native tool use function calls.
 - Supports OpenAI, Anthropic, Gemini, Vertex native image recognition API.
 - Supports four types of load balancing.
@@ -125,7 +125,7 @@ api_keys:
 ## Environment Variables

 - CONFIG_URL: The download address of the configuration file; it can be a local file or a remote file. Optional.
-- TIMEOUT: Request timeout, default is 40 seconds. The timeout can control the time needed to switch to the next channel when a channel does not respond. Optional.
+- TIMEOUT: Request timeout, default is 100 seconds. The timeout controls how long to wait before switching to the next channel when a channel does not respond. Optional.

 ## Docker Local Deployment
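Note: this commit raises the TIMEOUT default from 40 to 100 seconds. As a rough illustration of what such a knob gates, the sketch below shows a failover loop bounded by a per-channel client timeout. Only the TIMEOUT variable name comes from the README; the function and channel list are hypothetical, not uni-api's actual code.

import os
import httpx

# Hypothetical failover sketch: TIMEOUT bounds how long one channel may
# hang before the router abandons it and tries the next one.
TIMEOUT = float(os.environ.get("TIMEOUT", 100))

async def first_responding_channel(channels, payload):
    async with httpx.AsyncClient(timeout=TIMEOUT) as client:
        for url in channels:          # e.g. ordered by load-balancing policy
            try:
                return await client.post(url, json=payload)
            except httpx.TimeoutException:
                continue              # channel unresponsive; switch channels
    raise RuntimeError("all channels timed out")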
README_CN.md CHANGED
@@ -13,13 +13,13 @@

 ## Introduction

-如果个人使用的话,one/new-api 过于复杂,有很多个人不需要使用的商用功能,如果你不想要复杂的前端界面,又想要支持的模型多一点,可以试试 uni-api。这是一个统一管理大模型API的项目,可以通过一个统一的API接口调用多个后端服务,统一转换为 OpenAI 格式,支持负载均衡。目前支持的后端服务有:OpenAI、Anthropic、Gemini、Vertex、cloudflare、DeepBricks、OpenRouter 等。
+如果个人使用的话,one/new-api 过于复杂,有很多个人不需要使用的商用功能,如果你不想要复杂的前端界面,又想要支持的模型多一点,可以试试 uni-api。这是一个统一管理大模型API的项目,可以通过一个统一的API接口调用多个后端服务,统一转换为 OpenAI 格式,支持负载均衡。目前支持的后端服务有:OpenAI、Anthropic、Gemini、Vertex、Cohere、Cloudflare、DeepBricks、OpenRouter 等。

 ## Features

 - 无前端,纯配置文件配置 API 渠道。只要写一个文件就能运行起一个属于自己的 API 站,文档有详细的配置指南,小白友好。
 - 统一管理多个后端服务,支持 OpenAI、Deepseek、DeepBricks、OpenRouter 等其他 API 是 OpenAI 格式的提供商。支持 OpenAI Dalle-3 图像生成。
-- 同时支持 Anthropic、Gemini、Vertex API、cloudflare。Vertex 同时支持 Claude 和 Gemini API。
+- 同时支持 Anthropic、Gemini、Vertex AI、Cohere、Cloudflare。Vertex 同时支持 Claude 和 Gemini API。
 - 支持 OpenAI、Anthropic、Gemini、Vertex 原生 tool use 函数调用。
 - 支持 OpenAI、Anthropic、Gemini、Vertex 原生识图 API。
 - 支持四种负载均衡。
@@ -125,7 +125,7 @@ api_keys:
 ## 环境变量

 - CONFIG_URL: 配置文件的下载地址,可以是本地文件,也可以是远程文件,选填
-- TIMEOUT: 请求超时时间,默认为 40 秒,超时时间可以控制当一个渠道没有响应时,切换下一个渠道需要的时间。选填
+- TIMEOUT: 请求超时时间,默认为 100 秒,超时时间可以控制当一个渠道没有响应时,切换下一个渠道需要的时间。选填

 ## Docker Local Deployment
main.py CHANGED
@@ -229,13 +229,17 @@ async def process_request(request: Union[RequestModel, ImageGenerationRequest],
         engine = "claude"
     elif parsed_url.netloc == 'openrouter.ai':
         engine = "openrouter"
+    elif parsed_url.netloc == 'api.cohere.com':
+        engine = "cohere"
+        request.stream = True
     else:
         engine = "gpt"

     if "claude" not in provider['model'][request.model] \
             and "gpt" not in provider['model'][request.model] \
             and "gemini" not in provider['model'][request.model] \
-            and parsed_url.netloc != 'api.cloudflare.com':
+            and parsed_url.netloc != 'api.cloudflare.com' \
+            and parsed_url.netloc != 'api.cohere.com':
         engine = "openrouter"

     if "claude" in provider['model'][request.model] and engine == "vertex":
request.py CHANGED
@@ -1,12 +1,13 @@
 import os
 import re
 import json
-from models import RequestModel
-from utils import c35s, c3s, c3o, c3h, gem, BaseAPI
-
+import httpx
 import base64
 import urllib.parse

+from models import RequestModel
+from utils import c35s, c3s, c3o, c3h, gem, BaseAPI
+
 def encode_image(image_path):
     with open(image_path, "rb") as image_file:
         return base64.b64encode(image_file.read()).decode('utf-8')
@@ -82,6 +83,8 @@ async def get_text_message(role, message, engine = None):
         return {"text": message}
     if engine == "cloudflare":
         return message
+    if engine == "cohere":
+        return message
     raise ValueError("Unknown engine")

 async def get_gemini_payload(request, engine, provider):
@@ -215,8 +218,6 @@ async def get_gemini_payload(request, engine, provider):
     return url, headers, payload

 import time
-import httpx
-import base64
 from cryptography.hazmat.primitives import hashes
 from cryptography.hazmat.primitives.asymmetric import padding
 from cryptography.hazmat.primitives.serialization import load_pem_private_key
@@ -690,6 +691,73 @@ async def get_openrouter_payload(request, engine, provider):

     return url, headers, payload

+async def get_cohere_payload(request, engine, provider):
+    headers = {
+        'Content-Type': 'application/json'
+    }
+    if provider.get("api"):
+        headers['Authorization'] = f"Bearer {provider['api'].next()}"
+
+    url = provider['base_url']
+
+    role_map = {
+        "user": "USER",
+        "assistant": "CHATBOT",
+        "system": "SYSTEM"
+    }
+
+    messages = []
+    for msg in request.messages:
+        if isinstance(msg.content, list):
+            content = []
+            for item in msg.content:
+                if item.type == "text":
+                    text_message = await get_text_message(msg.role, item.text, engine)
+                    content.append(text_message)
+        else:
+            content = msg.content
+
+        if isinstance(content, list):
+            for item in content:
+                if item["type"] == "text":
+                    messages.append({"role": role_map[msg.role], "message": item["text"]})
+        else:
+            messages.append({"role": role_map[msg.role], "message": content})
+
+    model = provider['model'][request.model]
+    chat_history = messages[:-1]
+    query = messages[-1].get("message")
+    payload = {
+        "model": model,
+        "message": query,
+    }
+
+    if chat_history:
+        payload["chat_history"] = chat_history
+
+    miss_fields = [
+        'model',
+        'messages',
+        'tools',
+        'tool_choice',
+        'temperature',
+        'top_p',
+        'max_tokens',
+        'presence_penalty',
+        'frequency_penalty',
+        'n',
+        'user',
+        'include_usage',
+        'logprobs',
+        'top_logprobs'
+    ]
+
+    for field, value in request.model_dump(exclude_unset=True).items():
+        if field not in miss_fields and value is not None:
+            payload[field] = value
+
+    return url, headers, payload
+
 async def get_cloudflare_payload(request, engine, provider):
     headers = {
         'Content-Type': 'application/json'
@@ -989,6 +1057,8 @@ async def get_payload(request: RequestModel, engine, provider):
         return await get_cloudflare_payload(request, engine, provider)
     elif engine == "o1":
         return await get_o1_payload(request, engine, provider)
+    elif engine == "cohere":
+        return await get_cohere_payload(request, engine, provider)
     elif engine == "dalle":
         return await get_dalle_payload(request, engine, provider)
     else:
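To make get_cohere_payload's mapping concrete: the last OpenAI-style message becomes Cohere's message field and everything before it becomes chat_history, with roles renamed to USER/CHATBOT/SYSTEM. A self-contained sketch (the conversation and model name are made up for illustration):

role_map = {"user": "USER", "assistant": "CHATBOT", "system": "SYSTEM"}

# Made-up OpenAI-format conversation.
openai_messages = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello."},
    {"role": "user", "content": "Name three colors."},
]

converted = [{"role": role_map[m["role"]], "message": m["content"]}
             for m in openai_messages]

# Shape of the payload the function returns (model name illustrative).
payload = {
    "model": "command-r",
    "message": converted[-1]["message"],   # final turn becomes the query
    "chat_history": converted[:-1],        # earlier turns become history
}
print(payload)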
response.py CHANGED
@@ -184,6 +184,29 @@ async def fetch_cloudflare_response_stream(client, url, headers, payload, model)
         sse_string = await generate_sse_response(timestamp, model, content=message)
         yield sse_string

+async def fetch_cohere_response_stream(client, url, headers, payload, model):
+    timestamp = int(datetime.timestamp(datetime.now()))
+    async with client.stream('POST', url, headers=headers, json=payload) as response:
+        error_message = await check_response(response, "fetch_cohere_response_stream")
+        if error_message:
+            yield error_message
+            return
+
+        buffer = ""
+        async for chunk in response.aiter_text():
+            buffer += chunk
+            while "\n" in buffer:
+                line, buffer = buffer.split("\n", 1)
+                # logger.info("line: %s", repr(line))
+                resp: dict = json.loads(line)
+                if resp.get("is_finished") == True:
+                    yield "data: [DONE]\n\r\n"
+                    return
+                if resp.get("event_type") == "text-generation":
+                    message = resp.get("text")
+                    sse_string = await generate_sse_response(timestamp, model, content=message)
+                    yield sse_string
+
 async def fetch_claude_response_stream(client, url, headers, payload, model):
     timestamp = int(datetime.timestamp(datetime.now()))
     async with client.stream('POST', url, headers=headers, json=payload) as response:
@@ -270,6 +293,9 @@ async def fetch_response_stream(client, url, headers, payload, engine, model):
         elif engine == "cloudflare":
             async for chunk in fetch_cloudflare_response_stream(client, url, headers, payload, model):
                 yield chunk
+        elif engine == "cohere":
+            async for chunk in fetch_cohere_response_stream(client, url, headers, payload, model):
+                yield chunk
         else:
             raise ValueError("Unknown response")
     except httpx.ConnectError as e:
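For reference, the parser above treats the response body as newline-delimited JSON rather than SSE: each line is one Cohere event, text-generation events carry the token text, and is_finished marks the end of the stream. A self-contained sketch of the same splitting logic, with fabricated event lines:

import json

# Fabricated sample stream; field names mirror those read by
# fetch_cohere_response_stream above.
raw = (
    '{"event_type": "stream-start"}\n'
    '{"event_type": "text-generation", "text": "Hel"}\n'
    '{"event_type": "text-generation", "text": "lo"}\n'
    '{"event_type": "stream-end", "is_finished": true}\n'
)

buffer = raw
while "\n" in buffer:
    line, buffer = buffer.split("\n", 1)
    resp = json.loads(line)
    if resp.get("is_finished"):
        print("data: [DONE]")
        break
    if resp.get("event_type") == "text-generation":
        print("chunk:", resp["text"])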
test/test_matplotlib.py CHANGED
@@ -2,6 +2,7 @@ import json
 import matplotlib.pyplot as plt
 from datetime import datetime, timedelta
 from collections import defaultdict
+import numpy as np

 import matplotlib.font_manager as fm
 font_path = '/System/Library/Fonts/PingFang.ttc'
@@ -45,5 +46,75 @@ def create_pic(request_arrivals, key):
     # Save the figure
     plt.savefig(f'{key.replace("/", "")}.png')

+def create_pie_chart(model_counts):
+    models = list(model_counts.keys())
+    counts = list(model_counts.values())
+
+    # Assign colors and sort slices from largest to smallest
+    colors = plt.cm.Set3(np.linspace(0, 1, len(models)))
+    sorted_data = sorted(zip(counts, models, colors), reverse=True)
+    counts, models, colors = zip(*sorted_data)
+
+    # Create the pie chart
+    fig, ax = plt.subplots(figsize=(16, 10))
+    wedges, _ = ax.pie(counts, colors=colors, startangle=90, wedgeprops=dict(width=0.5))
+
+    # Add a donut-hole effect
+    centre_circle = plt.Circle((0, 0), 0.35, fc='white')
+    fig.gca().add_artist(centre_circle)
+
+    # Compute the total
+    total = sum(counts)
+
+    # Prepare the annotations
+    bbox_props = dict(boxstyle="round,pad=0.3", fc="w", ec="k", lw=0.72)
+    kw = dict(xycoords='data', textcoords='data', arrowprops=dict(arrowstyle="-"), bbox=bbox_props, zorder=0)
+
+    left_labels = []
+    right_labels = []
+
+    for i, p in enumerate(wedges):
+        ang = (p.theta2 - p.theta1) / 2. + p.theta1
+        y = np.sin(np.deg2rad(ang))
+        x = np.cos(np.deg2rad(ang))
+
+        percentage = counts[i] / total * 100
+        label = f"{models[i]}: {percentage:.1f}%"
+
+        if x > 0:
+            right_labels.append((x, y, label))
+        else:
+            left_labels.append((x, y, label))
+
+    # Draw labels on the left side
+    for i, (x, y, label) in enumerate(left_labels):
+        ax.annotate(label, xy=(x, y), xytext=(-1.2, 0.9 - i * 0.15), **kw)
+
+    # Draw labels on the right side
+    for i, (x, y, label) in enumerate(right_labels):
+        ax.annotate(label, xy=(x, y), xytext=(1.2, 0.9 - i * 0.15), **kw)
+
+    plt.title("各模型使用次数对比", size=16)
+    ax.set_xlim(-1.5, 1.5)
+    ax.set_ylim(-1.2, 1.2)
+    ax.axis('off')
+    plt.tight_layout()
+    plt.savefig('model_usage_pie_chart.png', bbox_inches='tight', pad_inches=0.5)
+
 if __name__ == '__main__':
-    create_pic(request_arrivals, 'POST /v1/chat/completions')
+    model_counts = {
+        "model_counts": {
+            "claude-3-5-sonnet": 94,
+            "o1-preview": 71,
+            "gpt-4o": 512,
+            "gpt-4o-mini": 5,
+            "gemini-1.5-pro": 5,
+            "deepseek-chat": 7,
+            "grok-2-mini": 1,
+            "grok-2": 9,
+            "o1-mini": 8
+        }
+    }
+    # create_pic(request_arrivals, 'POST /v1/chat/completions')
+
+    create_pie_chart(model_counts["model_counts"])