Artificial Intelligence | Building an In-House Enterprise Large Language Model System


Outline

Open-source large language models
Managing large language models
Deployment options for a private LLM service

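Open-source large language models

Worried about security and privacy? Use an open-source model that can be deployed privately

Commercial models, which cannot be deployed privately
- ChatGPT
- Claude
- Google Gemini
- Baidu ERNIE Bot (文心一言)
Open-source models, which can be deployed privately
- Mistral
- Meta Llama
- ChatGLM
- Alibaba Tongyi Qianwen (Qwen)

Commonly used open-source large language models

Branches of the open-source model family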

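Managing large language models

LLM management tools

HuggingFace: a comprehensive platform for managing large language models
Ollama: manages large language models locally, with very fast downloads
llama.cpp: LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware, locally and in the cloud
GPT4All: a free, locally running, privacy-aware chatbot that needs neither a GPU nor an internet connection

Ollama: the fastest LLM management tool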

Ollama commands

ollama pull llama2
ollama list
ollama run llama2 "Summarize this file: $(cat README.md)"
ollama serve
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
curl http://localhost:11434/api/chat -d '{
  "model": "mistral",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
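The same chat request can be made from Python. The following is a minimal sketch, assuming `ollama serve` is listening on its default address and the `mistral` model has already been pulled. The helper names `build_chat_request` and `chat` are made up for this example; only the `/api/chat` endpoint and the payload fields come from the commands above. Setting `"stream": false` asks Ollama for a single JSON reply rather than a stream of chunks.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address of `ollama serve`


def build_chat_request(model: str, user_text: str,
                       base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,  # one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, user_text: str) -> str:
    """Send the request to a running Ollama server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(model, user_text)) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With streaming disabled, the reply text is under message.content.
    return body["message"]["content"]
```

With the server running, `print(chat("mistral", "why is the sky blue?"))` should print the model's answer.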

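LLM front ends

Application front ends for large language models

Open-source platforms: ollama-chatbot, PrivateGPT, gradio
Open-source services: Hugging Face TGI, langchain-serve
Open-source frameworks: langchain, llama-index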

ollama chatbot

docker run -p 3000:3000 ghcr.io/ivanfioravanti/chatbot-ollama:main
# Then open http://localhost:3000

PrivateGPT

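PrivateGPT provides an API containing all the building blocks needed to build private, context-aware AI applications. The API follows and extends the OpenAI API standard and supports both normal and streaming responses. This means that if one of your tools can use the OpenAI API, it can use your own PrivateGPT API instead with no code changes, and at no cost if you run PrivateGPT in local mode.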

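PrivateGPT architecture

FastAPI
LlamaIndex
Local LLMs are supported, for example ChatGLM, llama, and Mistral
Remote LLMs are supported, for example OpenAI and Claude
Embeddings are supported, for example ollama and embeddings-huggingface
Vector stores are supported, for example Qdrant, ChromaDB, and Postgres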
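To make the roles of the embedding model and the vector store concrete, here is a toy sketch of the retrieval step in a context-aware pipeline. It is not PrivateGPT's or LlamaIndex's code: `embed` is a bag-of-words stand-in for a real embedding model, `TinyVectorStore` is an in-memory stand-in for a store such as Qdrant, and all of these names are invented for illustration.

```python
import math


def embed(text: str) -> dict[str, float]:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    vec: dict[str, float] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, dict[str, float]]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question: str, store: TinyVectorStore, k: int = 1) -> str:
    """Put the retrieved context in front of the question for the LLM."""
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

In a real deployment the prompt built this way would be sent to the configured LLM, and the embedding and search steps would be handled by the configured embedding model and vector store.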

PrivateGPT environment setup

git clone https://github.com/imartinez/privateGPT
cd privateGPT
# Python versions earlier than 3.11 are not supported
python3.11 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip poetry
# The official docs only ask for a small subset of the extras, but the dependency
# declarations are not fully reliable and it is easy to miss something,
# so the strategy here is to install all of them.
poetry install --extras "ui llms-llama-cpp llms-openai llms-openai-like llms-ollama llms-sagemaker llms-azopenai embeddings-ollama embeddings-huggingface embeddings-openai embeddings-sagemaker embeddings-azopenai vector-stores-qdrant vector-stores-chroma vector-stores-postgres storage-nodestore-postgres"
# Alternatively, build the list of extras from pyproject.toml with this one-liner:
# poetry install --extras "$(sed -n '/tool.poetry.extras/,/^$/p'  pyproject.toml | awk -F= 'NR>1{print $1}' | xargs)"

Deploying with Ollama

ollama pull mistral
ollama pull nomic-embed-text
ollama serve
# The extras listed in the official docs are not enough: torch is also needed,
# so prefer the install-everything approach described above.
poetry install --extras "ui llms-ollama embeddings-ollama vector-stores-qdrant"
PGPT_PROFILES=ollama poetry run python -m private_gpt

settings-ollama.yaml

server:
  env_name: ${APP_ENV:ollama}

llm:
  mode: ollama
  max_new_tokens: 512
  context_window: 3900
  temperature: 0.1  # The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual. (Default: 0.1)

embedding:
  mode: ollama

ollama:
  llm_model: mistral
  embedding_model: nomic-embed-text
  api_base: http://localhost:11434
  tfs_z: 1.0  # Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.
  top_k: 40  # Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
  top_p: 0.9  # Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
  repeat_last_n: 64  # Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
  repeat_penalty: 1.2  # Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant
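The effect of `temperature`, `top_k`, and `top_p` is easier to see on a tiny example. The sketch below is a generic illustration of these three sampling controls, not Ollama's internal implementation; the order in which the filters are applied and the function name `filter_distribution` are assumptions made for this example.

```python
import math


def filter_distribution(logits: dict[str, float], temperature: float = 0.1,
                        top_k: int = 40, top_p: float = 0.9) -> dict[str, float]:
    """Return the renormalised token probabilities that survive the filters.

    temperature must be greater than zero and logits must be non-empty.
    """
    # Temperature scaling: lower values sharpen the distribution.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda pair: pair[1], reverse=True)

    # top_k: keep only the k most probable tokens.
    ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches p.
    kept: list[tuple[str, float]] = []
    cumulative = 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}
```

With the low temperature used in the configuration above, almost all of the probability mass lands on the most likely token, which is why such a setting reads as more factual and less creative.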

Startup

PGPT_PROFILES=ollama poetry run python -m private_gpt

poetry run python -m private_gpt
02:36:06.928 [INFO    ] private_gpt.settings.settings_loader - Starting application with profiles=['default', 'ollama']
02:36:46.567 [INFO    ] private_gpt.components.llm.llm_component - Initializing the LLM in mode=ollama
02:36:47.405 [INFO    ] private_gpt.components.embedding.embedding_component - Initializing the embedding model in mode=ollama
02:36:47.414 [INFO    ] llama_index.core.indices.loading - Loading all indices.
02:36:47.571 [INFO    ]         private_gpt.ui.ui - Mounting the gradio UI, at path=/
02:36:47.620 [INFO    ]             uvicorn.error - Started server process [72677]
02:36:47.620 [INFO    ]             uvicorn.error - Waiting for application startup.
02:36:47.620 [INFO    ]             uvicorn.error - Application startup complete.
02:36:47.620 [INFO    ]             uvicorn.error - Uvicorn running on http://0.0.0.0:8001 (Press CTRL+C to quit)

PrivateGPT UI

Local deployment mode

# TODO: llama-cpp must be installed; the procedure differs by platform, see the official documentation
poetry run python scripts/setup
PGPT_PROFILES=local poetry run python -m private_gpt

settings-local.yaml

server:
  env_name: ${APP_ENV:local}

llm:
  mode: llamacpp
  # Should be matching the selected model
  max_new_tokens: 512
  context_window: 3900
  tokenizer: mistralai/Mistral-7B-Instruct-v0.2

llamacpp:
  prompt_style: "mistral"
  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.2-GGUF
  llm_hf_model_file: mistral-7b-instruct-v0.2.Q4_K_M.gguf

embedding:
  mode: huggingface

huggingface:
  embedding_hf_model_name: BAAI/bge-small-en-v1.5

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant

Non-private, OpenAI-powered deployment

poetry install --extras "ui llms-openai embeddings-openai vector-stores-qdrant"
PGPT_PROFILES=openai poetry run python -m private_gpt

settings-openai.yaml

server:
  env_name: ${APP_ENV:openai}

llm:
  mode: openai

embedding:
  mode: openai

openai:
  api_key: ${OPENAI_API_KEY:}
  model: gpt-3.5-turbo

OpenAI-style API calls

The API is built using FastAPI and follows OpenAI's API scheme.
The RAG pipeline is based on LlamaIndex.

curl -X POST http://localhost:8001/v1/completions \
     -H "Content-Type: application/json" \
     -d '{
  "prompt": "string",
  "stream": true
}'
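With `"stream": true` the reply arrives in pieces. The sketch below shows how a client might reassemble the text, assuming the server emits OpenAI-style server-sent events: `data: {...}` lines carrying `choices[].delta.content` and ending with `data: [DONE]`. That chunk shape is an assumption here and should be checked against the actual responses; `iter_stream_text` is a hypothetical helper.

```python
import json
from collections.abc import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style streaming lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other event fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text = delta.get("content")
            if text:
                yield text
```

The function accepts any iterable of text lines, so it can be fed the decoded lines of an HTTP response body or, as a quick check, a list of sample strings.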

