Machine Learning

Notes on Defending Against Prompt Injection Attacks

Clay
2024-02-26
Machine Learning

What Is a Prompt Injection Attack?

Prompt injection is an emerging security concern: a form of attack that mainly targets large language models (LLMs) and other AI-related systems.

[Solved] RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Clay
2024-02-22
Machine Learning, PyTorch

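Problem Description

When building deep learning models with PyTorch, we inevitably reshape layers and their inputs and outputs over and over; every AI engineer goes through this. PyTorch's reshaping method view() hides an interesting little pitfall: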

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
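
One common way to run into this error is to call view() on a tensor that is no longer contiguous in memory, for example after a transpose. The snippet below is a minimal sketch of that situation, not necessarily the exact case the post covers; the tensor shape and variable names are chosen only for illustration.

```python
import torch

x = torch.arange(6).reshape(2, 3)  # contiguous tensor of shape (2, 3)
xt = x.t()                         # transposed: shape (3, 2), no longer contiguous

try:
    xt.view(6)                     # view() cannot reinterpret this memory layout
    view_failed = False
except RuntimeError as err:
    view_failed = True
    print(err)

# Two ways around it:
flat_reshape = xt.reshape(6)           # reshape() copies the data when a view is impossible
flat_contig = xt.contiguous().view(6)  # or make the memory contiguous first, then view()
```

Both variants return the elements in the transposed order; reshape() is the shorter spelling, while contiguous().view() makes the copy explicit.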

[Solved] When Using SFTTrainer, Where Does the Loss Calculation Start If the Training Data Contains Multiple response_template Strings?

Clay
2024-02-19 / 2024-04-01
Machine Learning

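Problem Description

SFTTrainer is a training tool provided by HuggingFace for fine-tuning LLMs; it makes it quick to adjust many hyperparameters and detailed settings for a fine-tuning run. Among these, response_template is a special template string that we must pass in for the training data: everything after this template string contributes to the loss computation during training.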
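
The idea of "only tokens after the template count toward the loss" can be sketched in plain Python. This is an illustration of the masking concept only, not the library's actual implementation: `find_template`, `mask_labels`, and the `use_last` switch are hypothetical names, and the token ids are made up. The one borrowed convention is the label value -100, which PyTorch's CrossEntropyLoss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def find_template(token_ids, template_ids):
    """Return every start position where template_ids occurs in token_ids."""
    n = len(template_ids)
    return [i for i in range(len(token_ids) - n + 1)
            if token_ids[i:i + n] == template_ids]


def mask_labels(token_ids, template_ids, use_last=True):
    """Hypothetical helper: keep labels only for tokens after the chosen template match."""
    starts = find_template(token_ids, template_ids)
    if not starts:
        # No template found: nothing contributes to the loss.
        return [IGNORE_INDEX] * len(token_ids)
    start = starts[-1] if use_last else starts[0]
    response_start = start + len(template_ids)
    return [IGNORE_INDEX] * response_start + list(token_ids[response_start:])


tokens = [1, 9, 9, 2, 3, 9, 9, 4, 5]  # the template [9, 9] appears twice
template = [9, 9]
print(mask_labels(tokens, template, use_last=True))
print(mask_labels(tokens, template, use_last=False))
```

With two matches in the sequence, the choice between the first and the last occurrence changes how many tokens are trained on, which is exactly why the question in the title matters.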

[Paper Reading] ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT

Clay
2024-01-31 / 2024-07-25
Machine Learning

Introduction

ColBERT is an embedding model designed for retrieval tasks. It converts the tokens of the query and of the documents into embeddings one by one and computes their maximum similarity.
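
The "maximum similarity" step can be sketched in a few lines. This is a minimal illustration, assuming the late-interaction scoring described in the ColBERT paper: for each query token, take the maximum dot product over all document tokens, then sum those maxima. The function names and the toy 2-dimensional vectors are made up for the example; real ColBERT embeddings come from BERT.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_embs, doc_embs):
    """Sum, over query tokens, of the best-matching document token similarity."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)


# Toy per-token embeddings (assumed values, one vector per token).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 0.2], [0.1, 0.1]]

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)
```

Because each query token picks its own best match, documents are ranked by how well they cover every part of the query rather than by a single pooled vector.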

OpenAI Triton Note (2): Fused Softmax

Clay
2024-01-29
Machine Learning, PyTorch

Introduction

Softmax is a common activation function, and it is also frequently used as the last layer of a multi-class classifier.
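
As a reference point, here is a minimal plain-Python sketch of softmax, not the fused Triton kernel the post goes on to discuss. It subtracts the maximum before exponentiating, a standard trick to avoid overflow; the function name and inputs are chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)                            # shift by the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
print(softmax([1000.0, 1000.0]))  # large inputs are fine thanks to the shift
```

The outputs are non-negative and sum to 1, which is what makes softmax suitable for turning the last layer's scores into class probabilities.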

OpenAI Triton Note (1): Vector Addition

Clay
2024-01-28 / 2024-01-29
Machine Learning, PyTorch

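Introduction

Triton is an open-source GPU programming language and compiler released by OpenAI in 2021. In recent years, more and more developers have been using Triton to write and optimize parallel programs on GPUs. Compared with traditional options such as CUDA and OpenCL, Triton offers a Python-like syntax that is clearer and easier to pick up.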

[Paper Reading] RAGAS: Automated Evaluation of Retrieval Augmented Generation

Clay
2024-01-17 / 2024-07-25
Machine Learning, PyTorch

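Preface

2023 was the year generative AI exploded, with AI applications of every kind appearing one after another. In natural language processing (NLP), large language models (LLMs) are without question the most important technology. Once an LLM is trained well and its hallucinations are reduced, it can cut the human effort needed across all kinds of tasks.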

Using vLLM as an API Service with Dynamic Batching to Accelerate Inference

Clay
2024-01-11
Machine Learning

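Introduction

I previously wrote a note introducing the vLLM accelerated inference framework (Using vLLM for High-Speed Inference of Large Language Models (LLMs)), but because of length and time limits I did not get to explore its finer-grained features.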

Supervised Fine-tuning Trainer (SFTTrainer) Training Notes

Clay
2024-01-03
Machine Learning, PyTorch

[Solved] Mistral Does Not Output the eos_token `<|im_end|>` After Fine-Tuning with SFTTrainer

Clay
2023-12-31 / 2024-02-20
Machine Learning, PyTorch

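Problem Description

HuggingFace previously published an article suggesting that today's LLMs are best trained in the ChatML format. Typically, generation is organized around three roles (system, user, and assistant), in the following format: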
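
A minimal sketch of the commonly used ChatML layout, assuming the `<|im_start|>` / `<|im_end|>` delimiters that the post title refers to. The helper name `to_chatml` and the sample messages are hypothetical; in practice a tokenizer's chat template would normally produce this text.

```python
def to_chatml(messages, add_generation_prompt=False):
    """Hypothetical helper: render role/content messages as ChatML text."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help?"},
]
text = to_chatml(chat)
print(text)
```

Each turn ends with `<|im_end|>`, so a fine-tuned model is expected to emit that token when it finishes its assistant turn.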
