Clay
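[Solved] Mistral does not output the eos_token `<|im_end|>` after fine-tuning with SFTTrainer
Problem Description
HuggingFace previously published an article recommending that current LLMs be trained with the ChatML format. In the typical case, generation is organized around three distinct roles (system, user, and assistant), in the following format: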
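A minimal sketch of that layout, assuming the commonly used ChatML convention in which each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. The helper name `to_chatml` and the sample messages are hypothetical, for illustration only:

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as one ChatML string.

    Assumption: every turn is written as
    <|im_start|>{role}\n{content}<|im_end|>\n
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    return "".join(parts)


# Hypothetical conversation covering the three roles.
example = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help?"},
])
print(example)
```

Under this convention each assistant turn ends with `<|im_end|>`, which is why that token is the one expected to act as the end-of-sequence marker.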
[Solved][Linux] /bin/bash: warning: shell level (1000) too high, resetting to 1
Problem Description
/bin/bash: warning: shell level (1000) too high, resetting to 1
LLM Fine-tuning Notes - Differences Between SFT and DPO
Introduction
When fine-tuning Large Language Models (LLMs), Supervised Fine-tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Direct Preference Optimization (DPO) are all good options, but there are some differences among them.
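Notes on the Direct Preference Optimization (DPO) Training Method
Introduction
DPO (Direct Preference Optimization) is a fine-tuning method that replaces RLHF (Reinforcement Learning from Human Feedback). As is well known, after unsupervised learning a large language model acquires a large amount of knowledge and comprehension ability (some researchers argue that the knowledge is "compressed and stored" in the neural network weights); after supervised learning it learns to respond fluently to our questions, or in other words, it acquires the ability to "converse".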
[Solved] fatal error: portaudio.h: No such file or directory 9 | #include "portaudio.h" | ^~~~~~~~~~~~~ compilation terminated
Problem Description
Today, when I tried to install pyaudio (a Python package commonly used for audio recording) on a new Linux laptop, I ran into an error I had not encountered before:
LeetCode: 1637 Widest Vertical Area Between Two Points Containing No Points Solution Notes
Problem
Given n points on a 2D plane where points[i] = [xi, yi], return the widest vertical area between two points such that no points are inside the area.
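One way to approach this, sketched under the assumption that only the x-coordinates matter (a vertical area extends infinitely along the y-axis): sort the x values and take the largest gap between neighbours. The function name `max_width_of_vertical_area` is an illustrative choice, not necessarily the one used in the original post:

```python
def max_width_of_vertical_area(points):
    """Return the largest gap between consecutive sorted x-coordinates."""
    # y-coordinates are irrelevant for a vertical strip, so keep only x.
    xs = sorted(x for x, _ in points)
    # Adjacent values in sorted order have no other point between them.
    return max((b - a for a, b in zip(xs, xs[1:])), default=0)


print(max_width_of_vertical_area([[8, 7], [9, 9], [7, 4], [9, 7]]))
```

Sorting dominates the cost, so this sketch runs in O(n log n) time.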
LeetCode: 2706-Buy Two Chocolates Solution Notes
Problem
You are given an integer array prices representing the prices of various chocolates in a store. You are also given a single integer money, which represents your initial amount of money.
LeetCode: 661-Image Smoother Solution Notes
Problem
An image smoother is a filter of the size 3 x 3 that can be applied to each cell of an image by rounding down the average of the cell and the eight surrounding cells (i.e., the average of the nine cells in the blue smoother). If one or more of the surrounding cells of a cell is not present, we do not consider it in the average (i.e., the average of the four cells in the red smoother).
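The rule described above can be sketched directly: for each cell, sum whichever of the nine cells in its 3 x 3 window actually exist and floor-divide by how many were counted. The function name `image_smoother` and the list-of-lists input are assumptions made for illustration:

```python
def image_smoother(img):
    """Apply the 3 x 3 floor-average smoother to a 2D list of integers."""
    rows, cols = len(img), len(img[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            count = 0
            # Clamp the window to the image so missing neighbours are skipped.
            for nr in range(max(0, r - 1), min(rows, r + 2)):
                for nc in range(max(0, c - 1), min(cols, c + 2)):
                    total += img[nr][nc]
                    count += 1
            # Integer division rounds the average down.
            result[r][c] = total // count
    return result


print(image_smoother([[1, 1, 1], [1, 0, 1], [1, 1, 1]]))
```

The result is written into a separate grid so that already-smoothed values never feed into later averages.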