LLM Fine-tuning Note – Differences Between SFT and DPO
Introduction
When fine-tuning Large Language Models (LLMs), several methods are viable, including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Direct Preference Optimization (DPO). However, these approaches differ in important ways, as sketched below.
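To make the contrast concrete, here is a minimal sketch in PyTorch of the two training objectives: SFT minimizes token-level cross-entropy against a reference response, while DPO maximizes a preference margin between a chosen and a rejected response relative to a frozen reference model. The function names, tensor arguments, and the beta default are illustrative assumptions, not code from this note, and the sketch assumes per-sequence log-probabilities have already been computed.

```python
import torch
import torch.nn.functional as F

def sft_loss(logits, target_ids):
    # SFT: standard next-token cross-entropy over the response tokens.
    # logits: (batch, seq_len, vocab), target_ids: (batch, seq_len)
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)), target_ids.view(-1)
    )

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # DPO: compare the policy's log-prob gain over the frozen reference
    # model on the chosen vs. rejected response, and push the margin up.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```

In short, SFT only needs prompt–response pairs, whereas DPO needs preference pairs (a chosen and a rejected response per prompt) plus a reference model.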