1. Overview

- GPU deployment
- vLLM OpenAI-style requests
- Qwen architecture
- Training data format
- Loss function
- Training demo
- Troubleshooting

2. Implementation

GPU deployment

https://www.modelscope.cn/models/Qwen/Qwen3-Embedding-8B/summary

Note: a SentenceTransformer deployment may run out of GPU memory. Deploying through transformers instead automatically places the model and data across the available GPUs, which avoids out-of-memory errors on a single card. When deploying with vLLM, launch it from a script and specify the task as embed (see the command in the next section).
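As a reference for the transformers-based multi-GPU loading mentioned above, here is a minimal sketch. It assumes device_map="auto" sharding, float16 weights, and last-token pooling with left padding; the model path and max length are placeholders rather than values from the original post.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_DIR = "Qwen/Qwen3-Embedding-8B"  # placeholder: local checkpoint dir or hub id

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, padding_side="left")
# device_map="auto" lets accelerate shard the weights over all visible GPUs,
# which is the multi-card placement described above.
model = AutoModel.from_pretrained(
    MODEL_DIR, torch_dtype=torch.float16, device_map="auto"
).eval()

def encode(texts, max_length=512):
    batch = tokenizer(
        texts, padding=True, truncation=True,
        max_length=max_length, return_tensors="pt",
    ).to(model.device)
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # [B, L, D]
    emb = hidden[:, -1]  # last-token pooling: with left padding, index -1 is each sequence's EOS
    return F.normalize(emb, p=2, dim=-1)

print(encode(["What is the capital of China?", "Explain gravity"]).shape)
```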
vLLM OpenAI-style requests

Launch the server (the task must be embed):

```bash
python -m vllm.entrypoints.openai.api_server \
    --model Qwen3-Embedding-8B \
    --task embed \
    --tensor-parallel-size 4 \
    --host 0.0.0.0 \
    --port 8000 \
    --enforce_eager \
    --dtype float16 \
    --gpu_memory_utilization 0.35 \
    --served-model-name kejiqiye
```

Query it with curl:

```bash
curl http://localhost:8000/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "kejiqiye",
        "input": ["What is the capital of China?", "Explain gravity"]
    }'
```

Or with the OpenAI Python client:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",  # vLLM does not check the key; any value works
)

resp = client.embeddings.create(
    model="kejiqiye",
    input=["What is the capital of China?", "Explain gravity"],
)

for item in resp.data:
    print(len(item.embedding))  # embedding dimension
```

Model architecture

The model architecture is Qwen3Model:

```
Qwen3Model(
  (embed_tokens): Embedding(151665, 4096)
  (layers): ModuleList(
    (0-35): 36 x Qwen3DecoderLayer(
      (self_attn): Qwen3Attention(
        (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
        (k_proj): Linear(in_features=4096, out_features=1024, bias=False)
        (v_proj): Linear(in_features=4096, out_features=1024, bias=False)
        (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
        (q_norm): Qwen3RMSNorm((128,), eps=1e-06)
        (k_norm): Qwen3RMSNorm((128,), eps=1e-06)
      )
      (mlp): Qwen3MLP(
        (gate_proj): Linear(in_features=4096, out_features=12288, bias=False)
        (up_proj): Linear(in_features=4096, out_features=12288, bias=False)
        (down_proj): Linear(in_features=12288, out_features=4096, bias=False)
        (act_fn): SiLU()
      )
      (input_layernorm): Qwen3RMSNorm((4096,), eps=1e-06)
      (post_attention_layernorm): Qwen3RMSNorm((4096,), eps=1e-06)
    )
  )
  (norm): Qwen3RMSNorm((4096,), eps=1e-06)
  (rotary_emb): Qwen3RotaryEmbedding()
)
```

Training data format

Either format below works; select the loss with --loss_type infonce:

```
# sample without rejected_response
{"query": "sentence1", "response": "sentence1-positive"}
# sample with multiple rejected_response
{"query": "sentence1", "response": "sentence1-positive", "rejected_response": ["sentence1-negative1", "sentence1-negative2", ...]}
```

Other --loss_type values, such as cosine_similarity, can also be used, but each requires its own corresponding dataset format.

Loss function

InfoNCE (Information Noise-Contrastive Estimation) loss helps the model learn effective feature representations by maximizing the similarity of positive pairs while minimizing the similarity of negative pairs. Core mechanism: a softmax contrasts the similarity of the positive pair against the similarities of all pairs, maximizing the probability assigned to the positive pair. This is equivalent to optimizing a lower bound on the mutual information I(q; k+), which lets the model capture the essential features of the data.

Loss function demo (ms-swift, plugin/loss.py):

```python
# split tensors into single samples
# for example: batch_size=2 with tensors anchor(1)+positive(1)+negatives(3), anchor(1)+positive(1)+negatives(2)
# labels will be [1,0,0,0,1,0,0], meaning 1 positive, 3 negatives, 1 positive, 2 negatives
split_tensors = _parse_multi_negative_sentences(sentences, labels, hard_negatives)  # negative numbers are equal
# [B, neg+2, D]
sentences = torch.stack(split_tensors, dim=0)
# [B, 1, D] * [B, neg+1, D]
similarity_matrix = torch.matmul(sentences[:, 0:1], sentences[:, 1:].transpose(1, 2)) / temperature
# The positive one is the first element
labels = torch.zeros(len(split_tensors), dtype=torch.int64).to(sentences.device)
loss = nn.CrossEntropyLoss()(similarity_matrix.squeeze(1), labels)
# InfoNCE reframes contrastive learning as a classification problem and optimizes it with cross-entropy.
```
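To make the snippet above concrete, here is a self-contained sketch of the same InfoNCE computation on dummy data. The batch size, negative count and temperature (0.05) are assumptions for illustration only, not values taken from ms-swift.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

B, num_neg, D = 4, 5, 4096      # assumed: 4 groups, 5 negatives each, Qwen3-8B hidden size
temperature = 0.05              # assumed temperature

# Dummy normalized embeddings laid out as [anchor, positive, neg1..neg5] per group,
# i.e. the [B, neg+2, D] tensor produced by stacking the split samples.
sentences = F.normalize(torch.randn(B, num_neg + 2, D), dim=-1)

# [B, 1, D] x [B, D, neg+1] -> [B, 1, neg+1]: anchor vs. (positive + negatives)
similarity_matrix = torch.matmul(sentences[:, 0:1], sentences[:, 1:].transpose(1, 2)) / temperature

# The positive is always at index 0, so the "class label" of every group is 0.
labels = torch.zeros(B, dtype=torch.int64)
loss = nn.CrossEntropyLoss()(similarity_matrix.squeeze(1), labels)
print(loss.item())
```

This mirrors the labels tensor shown in the data-flow walkthrough below ([1, 0, 0, 0, 0, 0] repeated four times): each 1 marks a positive and the following 0s its negatives, and once the samples are grouped, the cross-entropy target for every row is index 0.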
Training demo

https://github.com/QwenLM/Qwen3-Embedding/blob/main/docs/training/SWIFT.md

During training, the query, positive and negative samples are fed to the model together. With a batch size of 4, the actual input batch becomes batch * (query_num + positive_num + negative_num). In the example below each sample has 1 query, 1 positive and 5 negatives, so a batch of 4 expands to 4 x 7 = 28 sequences, which matches the model input shape [28, 101] further down.

Data flow

Before data processing (batch size 4; one sample shown):

```
{
  'anchor_input_ids': [100007, 67338, 117487, 36407, 104332, 100298, 101807, 102969, 103584, 99252, 11319, 151643],
  'anchor_labels': [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 151643],
  'anchor_length': 12,
  'positive_input_ids': [30534, 101907, 114826, 9370, 100745, 117487, 3837, 100724, 108521, 104183, 3837, 106084, 104273, 99222, 100964, 104705, 50511, 101153, 117487, 3837, 104431, 114826, 102488, 99616, 1773, 151643],
  'negative_input_ids': [
    [99545, 106033, 99971, 100673, 106033, 99971, 16530, 104460, 101548, 3837, 106581, 100673, 99545, 106033, 1773, 151643],
    [111474, 99545, 71138, 99769, 102119, 108132, 111474, 104569, 105419, 15946, 3837, 109108, 105802, 102988, 118406, 104250, 3837, 100631, 105625, 15946, 104760, 102268, 101199, 53153, 99250, 104155, 104460, 106293, 1773, 151643],
    [100141, 50404, 99514, 121664, 5373, 111741, 5373, 32463, 52510, 32463, 99971, 111177, 9370, 102685, 72990, 103239, 86402, 1773, 73157, 102131, 30440, 99330, 102354, 99971, 20, 15, 15, 15, 106301, 1773, 151643],
    [99752, 100471, 99252, 9370, 99252, 52129, 31914, 101400, 21287, 18493, 120375, 102176, 33108, 99243, 99742, 119998, 9370, 59956, 101719, 52853, 17447, 1773, 151643],
    [104332, 101002, 101408, 100518, 100439, 115979, 100470, 115509, 107549, 33108, 104697, 102013, 3837, 104431, 101807, 100517, 101304, 24968, 39352, 104222, 108623, 100417, 82224, 104397, 100417, 99430, 71817, 111936, 117705, 3837, 104216, 101254, 108069, 102756, 104332, 1773, 101899, 99522, 3837, 50511, 37029, 110606, 9370, 101320, 107696, 71138, 5373, 105409, 5373, 116473, 90395, 100638, 111848, 86119, 1773, 91572, 3837, 73670, 37029, 99420, 110819, 104459, 71817, 30844, 21894, 40916, 69905, 1773, 151643]
  ],
  'positive_labels': [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 151643],
  'negative_labels': [
    [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 151643],
    [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 151643],
    [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 151643],
    [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 151643],
    [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 151643]
  ],
  'negative_loss_scale': [None, None, None, None, None],
  'positive_length': 26,
  'negative_length': [16, 30, 31, 23, 69],
  'labels': [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
  'length': 207
}
```

Model input:

```
tensor([[ 18493,  30709, 100761,  ..., 151643, 151643, 151643],
        [101138, 112575,   3837,  ..., 151643, 151643, 151643],
        [ 71817, 100027,  27091,  ..., 151643, 151643, 151643],
        ...,
        [102354, 100174,  16530,  ..., 151643, 151643, 151643],
        [105625,   9370,  99971,  ..., 151643, 151643, 151643],
        [112646,   9370, 111858,  ..., 151643, 151643, 151643]])
torch.Size([28, 101])
```

Loss function input:

```
outputs = {'last_hidden_state': tensor([[-0.0289, -0.0035, -0.0092,  ...,  0.0090,  0.0110, -0.0197],
        [-0.0469,  0.0026, -0.0105,  ...,  0.0029, -0.0238,  0.0090],
        [ 0.0062,  0.0334, -0.0051,  ...,  0.0036,  0.0011,  0.0069],
        ...,
        [-0.0938, -0.0581, -0.0106,  ..., -0.0077,  0.0098,  0.0483],
        [-0.0378, -0.0742, -0.0081,  ..., -0.0269, -0.0091, -0.0009],
        [-0.0055, -0.0125, -0.0046,  ..., -0.0204, -0.0028, -0.0081]],
       dtype=torch.bfloat16, grad_fn=<DivBackward0>)}
labels = tensor([1., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.,
        1., 0., 0., 0., 0., 0.])
```

Data demo: qwen3_emb.json

```json
{
  "query": "如何培养一缸清澈透明、无异味且水质稳定的嫩绿水",
  "response": "培养嫩绿水的关键因素有光线、营养和日常管理。需要选择适当的光线合理调整养殖密度适中地换水并每天目测水质变化及时应对问题。光线选择要适度散光太强要有遮蔽物光线太弱可以使用较好的照明设备。营养管理要合理放养换水次数要适中。管理要到位确保每天观察水质的物理及生物因素变化。",
  "rejected_response": [
    "因为插穗未生根之前必须保证插穗鲜嫩能进行光合作用以制造生根物质同时减少插穗的水分蒸发。",
    "绿茶属于不发酵茶其制成品的色泽和冲泡后的茶汤较多地保存了鲜茶叶的绿色格调。",
    "水滴鱼的身体呈凝胶果冻状全身缺少肌肉表皮略带粉色密度比海水低所以能够漂浮在水中它没有鱼鳔主要依靠一身的凝胶物来平衡水密度和抗压强。",
    "顺德筲箕鱼的特点是按鱼身不同部分安排不同调味并使用竹藤编制的筲箕蒸制使鱼肉达到一鱼三味的效果。",
    "采取一切有利于幼苗生长的措施,提高幼苗生存率。这一时期,水分是决定幼苗成活的关键因子。"
  ]
}
```
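Samples in this format can be written out with a few lines of Python before launching training. The sketch below assumes the dataset file is JSON Lines (one object per line), as the single-line sample above suggests; the in-memory samples and the file name qwen3_emb_train.json are placeholders matching the training command that follows.

```python
import json

# Hypothetical samples in the query / response / optional rejected_response format described above.
samples = [
    {
        "query": "如何培养一缸清澈透明、无异味且水质稳定的嫩绿水",
        "response": "培养嫩绿水的关键因素有光线、营养和日常管理。",
        "rejected_response": [
            "绿茶属于不发酵茶。",
            "水滴鱼的身体呈凝胶果冻状。",
        ],
    },
    # a sample without rejected_response is also valid
    {"query": "sentence1", "response": "sentence1-positive"},
]

# One JSON object per line, matching --dataset qwen3_emb_train.json below.
with open("qwen3_emb_train.json", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```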
Training command (GPU)

```bash
INFONCE_MASK_FAKE_NEGATIVE=true \
swift sft \
    --model dir1 \
    --task_type embedding \
    --model_type qwen3_emb \
    --train_type lora \
    --dataset qwen3_emb_train.json \
    --split_dataset_ratio 0.05 \
    --eval_strategy steps \
    --output_dir output \
    --eval_steps 100 \
    --num_train_epochs 1 \
    --save_steps 100 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --learning_rate 6e-6 \
    --loss_type infonce \
    --label_names labels \
    --dataloader_drop_last true
```

Training command (CPU)

```bash
swift sft --model dir1 --task_type embedding --model_type qwen3_emb --train_type lora --dataset qwen3_emb_train.json --split_dataset_ratio 0.05 --eval_strategy steps --output_dir output --eval_steps 100 --num_train_epochs 1 --save_steps 100 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --gradient_accumulation_steps 4 --learning_rate 6e-6 --loss_type infonce --label_names labels --dataloader_drop_last true --fp16 false --bf16 false --use_cpu
```

Troubleshooting

```
[2025-09-23 08:26:21] 45fb934b8545:5358:5461 [1] misc/shmutils.cc:87 NCCL WARN Error: failed to extend /dev/shm/nccl-vCZC3l to 33030532 bytes, error: No space left on device (28)
[2025-09-23 08:26:21] 45fb934b8545:5358:5461 [1] misc/shmutils.cc:129 NCCL WARN Error while creating shared memory segment /dev/shm/nccl-vCZC3l (size 33030528), error: No space left on device (28)
[2025-09-23 08:26:21] 45fb934b8545:5359:5455 [2] misc/shmutils.cc:87 NCCL WARN Error: failed to extend /dev/shm/nccl-Wvu7fr to 33030532 bytes, error: No space left on device (28)
[2025-09-23 08:26:21] 45fb934b8545:5357:5457 [0] misc/shmutils.cc:87 NCCL WARN Error: failed to extend /dev/shm/nccl-TM58ig to 33030532 bytes, error: No space left on device (28)
[2025-09-23 08:26:21] 45fb934b8545:5360:5459 [3] misc/shmutils.cc:87 NCCL WARN Error: failed to extend /dev/shm/nccl-kIuUhv to 33030532 bytes, error: No space left on device (28)
45fb934b8545:5358:5461 [1] NCCL INFO proxy.cc:1336 - 2
[2025-09-23 08:26:21] 45fb934b8545:5359:5455 [2] misc/shmutils.cc:129 NCCL WARN Error while creating shared memory segment /dev/shm/nccl-Wvu7fr (size 33030528), error: No space left on device (28)
45fb934b8545:5359:5455 [2] NCCL INFO proxy.cc:1336 - 2
[2025-09-23 08:26:21] 45fb934b8545:5357:5457 [0] misc/shmutils.cc:129 NCCL WARN Error while creating shared memory segment /dev/shm/nccl-TM58ig (size 33030528), error: No space left on device (28)
[2025-09-23 08:26:21] 45fb934b8545:5360:5459 [3] misc/shmutils.cc:129 NCCL WARN Error while creating shared memory segment /dev/shm/nccl-kIuUhv (size 33030528), error: No space left on device (28)
```

When using multiple GPUs inside Docker, shared memory (/dev/shm) easily runs out and vLLM fails to start. The fix is to recreate the container with a larger shared memory size:

```bash
docker run -it --gpus all --name common_test1 --shm-size=24g -p38015:8014 -p38015:8013 -v /home/jyf:/home/ --privileged=true tone.tcl.com/devops/docker/release/ops/pytorch:2.1.0-cuda12.1-cudnn8-devel /bin/bash
```