Using vLLM To Accelerate Inference With Continuous Batching
Introduction
I previously wrote a note introducing the vLLM inference-acceleration framework (Using vLLM To Accelerate The Decoding Of Large Language Model), but due to space and time constraints I couldn't cover its features in much detail. This post focuses on one of them: continuous batching.
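As a quick refresher before diving in, here is a minimal sketch of offline inference with vLLM; its engine applies continuous batching to concurrent requests automatically, so no extra configuration is needed. The model name and prompts below are placeholders, not from the original note.

```python
# Minimal vLLM offline-inference sketch. The model name is a
# placeholder -- substitute any model you have access to.
from vllm import LLM, SamplingParams

prompts = [
    "Explain continuous batching in one sentence.",
    "What is PagedAttention?",
]

sampling_params = SamplingParams(temperature=0.8, max_tokens=128)
llm = LLM(model="facebook/opt-125m")

# vLLM's engine schedules these prompts with continuous batching
# under the hood, interleaving requests at the iteration level.
for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)
```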