Clay

“Common sense, as people call it, is merely the biases learned during youth”—the training data for AI models is no different

Clay
2024-10-06
AI

This year, for work, I tried annotating data myself. Only after doing it hands-on did I truly understand how profoundly training data affects an AI model.

Clay
2024-10-04
Python

Introduction

Recently, while handling some work-related matters, I noticed that the client might need a way to extract text from PPT files. I discussed this with the PM and my supervisor, and they said the client could simply copy the text from the slides manually, unless the client explicitly asks us to extract it programmatically.

Clay
2024-10-03
Machine Learning, Python, Scikit Learn

The first time I heard about Vector Quantization (VQ) was from a friend who was working on audio processing, which gave me a vague understanding that VQ is a technique used for data feature compression and representation. At that time, I still wasn’t clear on how it differed from dimensionality reduction techniques like PCA.
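To make the contrast with PCA a little more concrete, here is a minimal sketch of the core VQ idea; it is not code from the original post. Each vector is replaced by the index of its nearest entry in a codebook, so the compressed form is a discrete index rather than a shorter continuous vector, which is what PCA would produce. The codebook and data below are hand-picked, hypothetical values; in practice a codebook is usually learned, for example with k-means.

```python
import math

def nearest_code(vector, codebook):
    """Return the index of the codeword closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))

def encode(vectors, codebook):
    """Compress each vector to a single integer: the index of its nearest codeword."""
    return [nearest_code(v, codebook) for v in vectors]

def decode(indices, codebook):
    """Reconstruct approximate vectors by looking the indices up in the codebook."""
    return [codebook[i] for i in indices]

# Hypothetical, hand-picked codebook and data (assumptions for illustration only).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
data = [(0.1, -0.2), (0.9, 1.2), (4.7, 5.1), (1.1, 0.8)]

codes = encode(data, codebook)            # one small integer per vector
reconstructed = decode(codes, codebook)   # lossy: each vector becomes its codeword

print(codes)
print(reconstructed)
```

The reconstruction is lossy: every vector assigned to the same codeword decodes to exactly the same point.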

Clay
2024-10-02
Linux

batcat, or simply bat, is a replacement for the cat command. It retains cat’s ability to display files while also adding syntax highlighting for code and configuration files, which makes browsing such files day to day more convenient for developers (so it’s definitely a productivity tool!).

Clay
2024-09-29
Linux

Ripgrep (rg) is a command-line tool for quickly searching file contents. It was designed as a replacement for grep and addresses grep’s efficiency issues when searching across large numbers of files.

Clay
2024-09-28
Linux

man is the traditional documentation tool for UNIX/Linux systems, but the detailed nature of its output can be overwhelming for users who just want a quick reference on how to use a command. Therefore, a simplified version called tldr was created (short for “too long, didn’t read”). It focuses on providing concise, easy-to-understand command documentation.

Clay
2024-09-27
Linux

I’ve been looking for a more visually appealing alternative to htop for a long time. A few years ago, during a gathering with friends, I happened to pull out my laptop to fix a Docker segmentation fault issue in the lab. One of my friends saw my htop and remarked, “So primitive~ Engineers are so boring~” I still hold a grudge about that (just kidding, of course).

Clay
2024-09-26
Linux

Introduction

viddy is a tool similar to watch: it runs a command at regular intervals in a Linux terminal and displays the output.

Clay
2024-09-25
AI, Machine Learning

In the process of training and fine-tuning deep neural networks, the most important and scarce resource is undoubtedly the GPU’s VRAM. Therefore, making every bit perform at its best is a critical task.
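As a rough illustration of why the choice of number format matters for VRAM, the sketch below uses only Python’s standard library to round-trip a value through float32, float16, and a simulated bfloat16. It is not code from the original post. The bfloat16 helper simply truncates a float32 to its top 16 bits, which is an assumption made for simplicity (real hardware typically rounds to nearest), and `weight_memory_gib` is a hypothetical helper that counts only the bytes needed to store the weights.

```python
import struct

def round_trip_float32(x: float) -> float:
    """Store x as IEEE 754 single precision (4 bytes) and read it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def round_trip_float16(x: float) -> float:
    """Store x as IEEE 754 half precision (2 bytes); the largest finite value is 65504."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def round_trip_bfloat16(x: float) -> float:
    """Simulate bfloat16 by keeping only the top 16 bits of the float32 encoding.

    Truncation is a simplification; it keeps float32's 8 exponent bits
    but only 7 mantissa bits, so the range is wide and the precision coarse.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def weight_memory_gib(num_params: int, bytes_per_value: int) -> float:
    """Hypothetical helper: memory for the weights alone, in GiB."""
    return num_params * bytes_per_value / 1024**3

for value in (0.1, 1000.123):
    print(value,
          round_trip_float32(value),
          round_trip_float16(value),
          round_trip_bfloat16(value))

# Same parameter count, half the bytes per value, half the weight memory.
print(weight_memory_gib(7_000_000_000, 4), weight_memory_gib(7_000_000_000, 2))
```

Both 16-bit formats halve the storage per value compared with float32, but they spend their bits differently: float16 keeps more mantissa bits (finer precision, smaller range), while bfloat16 keeps float32’s exponent width (larger range, coarser precision).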

Clay
2024-09-19
Linux

Linux has so many useful tools, and I truly want to document every single one of them. To celebrate Linux reaching a usage rate of 4.55% on StatCounter (2024-09-18), I’ve decided to document another tool recommended by a colleague—the fuck command.

Clay

“Common sense, as people call it, is merely the biases learned during youth”—the training data for AI models is no different

[Python] Extracting Text from PPT Using the python-pptx Library

[Machine Learning] Vector Quantization (VQ) Notes

[Linux] Use the batcat Command as a Replacement for cat, Highlighting Code or Configurations

[Linux] Ripgrep (rg): A Super Fast File Search Tool

[Linux] TL;DR: Replace man with tldr to Read Command Line Manuals

[Linux] bpytop: A More Modern and Visually Appealing Resource Monitoring Tool Compared to htop

[Linux] viddy: An Enhanced Version of the Watch Command

Differences in Precision Representations in Deep Learning: Float32, Float16, Float8, and BFloat16

[Linux] Using “The Fuck” Tool to Correct Mistyped Commands with fuck