
Machine Learning

[Solved] RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Problem Description

When building deep learning models in PyTorch, adjusting layer shapes and input/output dimensions is something every AI engineer has to deal with. However, PyTorch's view() method hides a small but interesting pitfall:

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
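A minimal sketch of when this error appears (my own illustration, not the post's example): view() requires a contiguous memory layout, so it fails on a tensor made non-contiguous by transpose(), while contiguous().view() or reshape() still work.

```python
import torch

x = torch.randn(3, 4)
y = x.transpose(0, 1)        # shape (4, 3), but non-contiguous in memory

print(y.is_contiguous())     # False
# y.view(12) would raise the RuntimeError quoted above.

# Either make the tensor contiguous first, or use reshape(),
# which copies the data only when it has to.
z1 = y.contiguous().view(12)
z2 = y.reshape(12)
print(z1.shape, z2.shape)    # torch.Size([12]) torch.Size([12])
```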

[Machine Learning] Note of Rotary Position Embedding (RoPE)

Introduction

(Note: Since this article is imported from my personal Hackmd, some symbols and formatting might not display properly in WordPress. I appreciate your understanding, sorry for any inconvenience.)

RoPE is a method that encodes each token's absolute position as a rotation of its query and key vectors, so that relative position information emerges naturally in the self-attention inner product.
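A minimal sketch of the rotation (a simplified, assumed channel pairing rather than the note's exact code): each pair of channels is rotated by an angle proportional to the token position, so the dot product between rotated queries and keys depends only on the position difference.

```python
import torch

def rotary_embedding(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply RoPE to x of shape (seq_len, dim); dim must be even."""
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair frequencies theta_i = base^(-2i/dim)
    theta = base ** (-torch.arange(0, half, dtype=torch.float32) * 2 / dim)
    # Rotation angle m * theta_i for every position m and channel pair i
    angles = torch.outer(torch.arange(seq_len, dtype=torch.float32), theta)
    cos, sin = angles.cos(), angles.sin()

    x1, x2 = x[..., 0::2], x[..., 1::2]      # even / odd channels form 2D pairs
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin     # standard 2D rotation, pair-wise
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(8, 16)   # (seq_len, head_dim)
k = torch.randn(8, 16)
# After rotation, the attention scores depend only on relative positions m - n.
scores = rotary_embedding(q) @ rotary_embedding(k).T
```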


[Paper Reading] Lifting the Curse of Multilinguality by Pre-training Modular Transformers

Cross-lingual Modular (X-Mod) is an interesting language model architecture that splits its parameters into language-specific module units, so the model can train a separate set of parameters when fine-tuning on a new language and thereby (comparatively) avoid catastrophic forgetting.
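A hypothetical sketch of the idea (the class name, sizes, and layer layout below are illustrative, not the paper's actual architecture): a shared layer plus one small language-specific module, selected by a language code at forward time.

```python
import torch
import torch.nn as nn

class ModularLayer(nn.Module):
    """Shared weights plus one small feed-forward 'module unit' per language."""

    def __init__(self, dim: int, languages: list):
        super().__init__()
        self.shared = nn.Linear(dim, dim)            # parameters shared by all languages
        self.lang_modules = nn.ModuleDict({          # language-specific parameters
            lang: nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(), nn.Linear(dim // 2, dim))
            for lang in languages
        })

    def forward(self, x: torch.Tensor, lang: str) -> torch.Tensor:
        h = self.shared(x)
        # Only the selected language's module is used (and updated), so adding
        # a new language leaves the modules of the other languages untouched.
        return h + self.lang_modules[lang](h)

layer = ModularLayer(dim=32, languages=["en", "de"])
out = layer(torch.randn(4, 32), lang="en")
```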


[Paper Reading] RAGAS: Automated Evaluation of Retrieval Augmented Generation

Introduction

The year 2023 witnessed an explosion of generative AI technologies, with a myriad of applications emerging across various domains. In the field of Natural Language Processing (NLP), Large Language Models (LLMs) stand out as one of the most significant advancements. When LLMs are trained effectively and their hallucinations are kept in check, they can significantly reduce human effort across a wide range of tasks.


Using CuPy to Accelerate Matrix Operations with GPU

Introduction

CuPy is an open-source GPU-accelerated numerical computation library designed for deep learning and scientific computing. It shares many of the same methods and functions as the popular NumPy package in Python but extends its capabilities to perform computations on the GPU. In short, tasks that can benefit from parallel computation on the GPU, such as matrix operations, can achieve significant acceleration with CuPy.
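A minimal sketch (assuming a CUDA-capable GPU and the cupy package are available) showing how closely the CuPy API mirrors NumPy for a matrix multiplication:

```python
import numpy as np

try:
    import cupy as cp          # requires a CUDA-capable GPU
except ImportError:
    cp = None

# The same matrix multiplication written with the NumPy API on the CPU...
a_cpu = np.random.rand(2000, 2000).astype(np.float32)
b_cpu = np.random.rand(2000, 2000).astype(np.float32)
c_cpu = a_cpu @ b_cpu

# ...and, nearly unchanged, with CuPy running on the GPU.
if cp is not None:
    a_gpu = cp.asarray(a_cpu)              # host -> device copy
    b_gpu = cp.asarray(b_cpu)
    c_gpu = a_gpu @ b_gpu                  # executed on the GPU
    print(np.allclose(c_cpu, cp.asnumpy(c_gpu), atol=1e-3))
```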
