Skip to content

AI

Troubleshooting Accelerated Inference of Gemma-2 on V100 GPUs Using vLLM

Problem Description

Recently, I've achieved some good application results by fine-tuning Gemma-2. However, I encountered various errors when deploying it on the client's equipment, which was quite frustrating. Currently, there isn't a systematic troubleshooting guide online, so I'm documenting it here.

Read More »Troubleshooting Accelerated Inference of Gemma-2 on V100 GPUs Using vLLM

Evaluating LLM Defense Capabilities Using the Microsoft BIPIA Framework

Currently, LLM services cover a wide range of fields, and Prompt Injection and Jailbreak threats to LLMs are growing by the day. A few months ago, a customer service LLM even provided incorrect information, leading to a loss of customer rights (although that wasn't caused by a prompt attack).

Microsoft's open-source BIPIA (Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models) evaluation method, although tested six months ago without significant updates since, remains a simple and convenient testing method for the tasks I have at hand.

Read More »Evaluating LLM Defense Capabilities Using the Microsoft BIPIA Framework

[Paper Reading] Lifting the Curse of Multilinguality by Pre-training Modular Transformers

Cross-lingual Modular (X-Mod) is an interesting language model architecture that modularizes the parameters for different languages as Module Units, allowing the model to use separate parameters when fine-tuning for a new language, thereby (comparatively) avoiding the problem of catastrophic forgetting.

Read More »[Paper Reading] Lifting the Curse of Multilinguality by Pre-training Modular Transformers

Use Text To Retrieve Images: Introduction Of Multi-Modals ColPali

Introduction

Since last year, I have been filled with enthusiasm and curiosity about Multi-Modal AI models. As a staunch advocate of AGI, I believe that AI's current potential has not yet reached its ceiling. One significant bottleneck and research direction in AI today is naturally the integration of various modalities (text, images, audio...) in model applications.

Read More »Use Text To Retrieve Images: Introduction Of Multi-Modals ColPali

Meta-llama--Prompt-Guard-86M: Open-Source Model for Prompt Protection, Detecting Malicious Attacks

Recently, Meta AI has released various versions of Llama 3.1 (405B, 70B, 8B), with the 405B model being particularly noteworthy. It's the first time an open-source LLM has caught up with closed-source models like GPT-4 and Claude-3.5. At the same time, Meta AI has also released a smaller model called Prompt-Guard-86M.

Read More »Meta-llama--Prompt-Guard-86M: Open-Source Model for Prompt Protection, Detecting Malicious Attacks

Stable Diffusion ComfyUI Note 02 - Build The Basic Workflow

Introduction

Previously, we finished the configuration of ComfyUI, now we can try to build a basic and simplest workflow. The workflow is the most different point with stable-diffusion-webui. ComfyUI uses a card-based process that makes it easier to understand how the Stable Diffusion model actually performs inference and also makes it easier to customize and achieve more advanced effects.

Read More »Stable Diffusion ComfyUI Note 02 - Build The Basic Workflow