[Paper Reading] ENTP: ENCODER-ONLY NEXT TOKEN PREDICTION
The following are some points in this paper:
Read More » [Paper Reading] ENTP: ENCODER-ONLY NEXT TOKEN PREDICTION

A multi-modal large language model (MLLM) isn’t limited to text only. I know the name might sound contradictory, but the term has become widely accepted. What I want to document today is how to fine-tune a multi-modal model using a script.
Read More » Notes on Fine-Tuning a Multi-Modal Large Language Model Using SFTTrainer (Taking LLaVa-1.5 as an Example)

This year, due to work, I tried annotating the data myself; it was only after diving into it personally that I truly understood just how profoundly training data affects an AI model.
Read More » “Common sense, as people call it, is merely the biases learned during youth”—the training data for AI models is no different

In the process of training and fine-tuning deep neural networks, the most important and scarce resource is undoubtedly the GPU’s VRAM. Therefore, making every bit perform at its best is a critical task.
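As a quick illustration of what those bits buy you, the common floating-point formats can be compared in plain Python. The standard `struct` module has no bfloat16 code, so `to_bfloat16` below emulates it by truncating a float32 bit pattern; this is a sketch of the layouts, not a production conversion routine.

```python
import struct

def bits(x: float, fmt: str) -> str:
    """Raw bit pattern of x packed in the given struct format."""
    size = struct.calcsize(fmt)
    return format(int.from_bytes(struct.pack(fmt, x), "little"), f"0{size * 8}b")

# float32: 1 sign bit, 8 exponent bits, 23 mantissa bits ("<f")
# float16: 1 sign bit, 5 exponent bits, 10 mantissa bits ("<e")
# bfloat16: the top 16 bits of a float32, i.e. 1 sign, 8 exponent,
# only 7 mantissa bits; emulated here by truncation:
def to_bfloat16(x: float) -> float:
    """Truncate a float32 bit pattern to bfloat16 (round toward zero)."""
    (u,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", u & 0xFFFF0000))
    return y

print(bits(1.0, "<f"))    # -> 00111111100000000000000000000000
print(bits(1.0, "<e"))    # -> 0011110000000000
# float16 overflows past 65504, while bfloat16 keeps float32's full
# exponent range at the cost of mantissa precision:
print(to_bfloat16(1e38))  # a huge value survives in bfloat16
print(to_bfloat16(1.001)) # but only ~2-3 decimal digits are kept -> 1.0
```

The trade-off is visible directly in the bit budget: bfloat16 spends its 16 bits on range (same 8-bit exponent as float32), float16 spends them on precision.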
Read More » Differences in Precision Representations in Deep Learning: Float32, Float16, Float8, and BFloat16

Recently, I’ve achieved some good application results by fine-tuning Gemma-2. However, I encountered various errors when deploying it on the client’s equipment, which was quite frustrating. Currently, there isn’t a systematic troubleshooting guide online, so I’m documenting it here.
Read More » Troubleshooting Accelerated Inference of Gemma-2 on V100 GPUs Using vLLM

When applying Large Language Models (LLMs) in real-world scenarios, it’s often not just about letting the model generate text freely. We might want the model to return specific structures, such as answering a multiple-choice question or providing a rating. In such cases, transformers-based models can directly use the outlines tool.
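Tools like outlines work by constraining decoding so that only tokens compatible with the target structure can be sampled. The core idea can be sketched in a few lines of toy Python; `constrained_greedy`, the tiny vocabulary, and the fake logits below are all hypothetical, not outlines' actual API:

```python
def constrained_greedy(logits_per_step, vocab, choices):
    """Greedy decoding restricted to a fixed set of answer strings.

    At each step, tokens that cannot extend any of `choices` are masked
    out, so the final output is guaranteed to be one of them. Real
    implementations compile the constraint into a finite state machine
    over the tokenizer's vocabulary instead of scanning prefixes.
    """
    out = ""
    for logits in logits_per_step:
        # Allowed tokens: those keeping `out + token` a prefix of some choice.
        allowed = [t for t in vocab
                   if any(c.startswith(out + t) for c in choices)]
        if not allowed:
            break
        out += max(allowed, key=lambda t: logits[vocab.index(t)])
        if out in choices:
            break
    return out

vocab = ["A", "B", "yes", "no", "maybe"]
# An unconstrained argmax over these fake logits would pick "maybe";
# the constraint forces the output into {"yes", "no"}.
answer = constrained_greedy([[0.1, 0.2, 0.5, 0.4, 2.0]], vocab, ["yes", "no"])
print(answer)  # -> yes
```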
Read More » Structuring Model Outputs Using the Outlines Tool

Currently, LLM services cover a wide range of fields, and Prompt Injection and Jailbreak threats to LLMs are growing by the day. A few months ago, a customer service LLM even provided incorrect information, leading to a loss of customer rights (although that wasn’t caused by a prompt attack).
Microsoft’s open-source BIPIA (Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models) evaluation framework, although it hasn’t received significant updates since I tried it six months ago, remains a simple and convenient way to test the tasks I have at hand.
Read More » Evaluating LLM Defense Capabilities Using the Microsoft BIPIA Framework

Cross-lingual Modular (X-Mod) is an interesting language model architecture that modularizes the parameters for different languages as Module Units, allowing the model to use separate parameters when fine-tuning for a new language, thereby (comparatively) avoiding the problem of catastrophic forgetting.
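The modular idea can be sketched with toy Python. Everything here (`XModLayer`, the scalar "modules") is an illustrative stand-in, not the paper's actual Transformer layer:

```python
def make_module(scale):
    """Stand-in for a per-language Module Unit; here just a scaling."""
    return lambda h: [scale * v for v in h]

class XModLayer:
    def __init__(self, languages):
        # Language-agnostic transform, shared by every language.
        self.shared = lambda h: [v + 1.0 for v in h]
        self.modules = {lang: make_module(s) for lang, s in languages.items()}

    def add_language(self, lang, scale):
        """Extending to a new language only adds a module; the shared
        weights and the other languages' modules stay untouched."""
        self.modules[lang] = make_module(scale)

    def forward(self, h, lang):
        # Route through the shared backbone, then the language's module.
        return self.modules[lang](self.shared(h))

layer = XModLayer({"en": 1.0, "de": 0.5})
before = layer.forward([1.0], "en")
layer.add_language("fr", 2.0)        # "fine-tune" for French
after = layer.forward([1.0], "en")   # English behavior is unchanged
print(before, after)                 # -> [2.0] [2.0]
```

Because adding French only creates a new entry in `modules`, the English path is bit-for-bit identical before and after, which is the sense in which the design (comparatively) sidesteps catastrophic forgetting.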
Read More » [Paper Reading] Lifting the Curse of Multilinguality by Pre-training Modular Transformers

When using ComfyUI to generate images, we need to leverage the capabilities of various models to ultimately form a complete workflow. In other words, these various models together constitute what we call Stable Diffusion. Today, I will introduce where to download these models.
Read More » Stable Diffusion ComfyUI Note 03 – How To Download SD Models

I previously wrote a note introducing the vLLM accelerated inference framework (Using vLLM To Accelerate The Decoding Of Large Language Model), but due to space and time constraints, I couldn’t delve into its more detailed features.
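One such feature is continuous batching: a finished sequence's slot in the batch is refilled immediately, instead of waiting for the whole batch to drain. A toy scheduler can illustrate the idea; this is a hypothetical simulation, not vLLM's actual implementation:

```python
from collections import deque

def continuous_batching(requests, batch_size):
    """Toy simulation of continuous batching.

    `requests` maps a request id to the number of decode steps it needs.
    Unlike static batching, a finished sequence's slot is refilled from
    the waiting queue on the very next step, so the batch stays full.
    Returns the decode step at which each request completes.
    """
    queue = deque(requests.items())
    running = {}       # request id -> remaining decode steps
    finished_at = {}
    step = 0
    while queue or running:
        # Refill free slots immediately: the key difference from
        # static batching, which waits for the whole batch to finish.
        while queue and len(running) < batch_size:
            rid, needed = queue.popleft()
            running[rid] = needed
        step += 1
        for rid in list(running):
            running[rid] -= 1
            if running[rid] == 0:
                del running[rid]
                finished_at[rid] = step
    return finished_at

# "c" is admitted at step 2, as soon as "a" frees its slot:
print(continuous_batching({"a": 1, "b": 3, "c": 2}, batch_size=2))
# -> {'a': 1, 'b': 3, 'c': 3}
```

Under static batching, "c" would only start after both "a" and "b" finished at step 3, completing at step 5 instead of step 3; keeping slots full is where the throughput gain comes from.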
Read More » Using vLLM To Accelerate Inference Speed By Continuous Batching