
Byte-Pair Encoding

Byte-Pair Encoding (BPE) is a frequency-based symbol merging algorithm that was originally proposed as a data compression method. In natural language processing (NLP), BPE has been reinterpreted as a subword tokenization technique that strikes a balance between characters and full words. By automatically learning high-frequency fragments from data, BPE can construct a scalable vocabulary without relying on any language-specific knowledge.
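The merge loop at the heart of BPE can be sketched in a few lines of plain Python. This is a toy illustration rather than a production tokenizer; `learn_bpe`, `merge_word`, and the corpus format are names invented for this sketch.

```python
from collections import Counter

def merge_word(symbols, pair):
    # Replace each adjacent occurrence of `pair` with the fused symbol.
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def learn_bpe(corpus, num_merges):
    # `corpus` maps space-separated symbol sequences to frequencies,
    # e.g. {"l o w": 5, "l o w e r": 2}.
    vocab, merges = dict(corpus), []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        vocab = {" ".join(merge_word(w.split(), best)): f
                 for w, f in vocab.items()}
    return merges, vocab
```

On a small word-frequency table, each iteration fuses the single most frequent adjacent pair, so frequent fragments like "es" and "est" emerge after only a few merges.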

Chain-of-Thought, CoT

The performance of LLMs on reasoning tasks has undergone substantial change in recent years with the introduction of Chain-of-Thought (CoT) prompting. This technique guides an LLM to produce step-by-step intermediate reasoning, enabling the model to exhibit a human-like structure of thought. As task complexity increases, however, the limitations of traditional CoT have become more apparent, motivating a series of follow-up methods designed to address these issues. This article presents an overview of CoT and its extensions.
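The core idea can be seen by comparing a direct prompt with a CoT-style prompt. A minimal sketch in Python: the function names are illustrative, and the zero-shot trigger phrase "Let's think step by step" is one well-known way to elicit chain-of-thought reasoning.

```python
def direct_prompt(question):
    # Plain prompt: ask the model for an answer immediately.
    return f"Q: {question}\nA:"

def cot_prompt(question):
    # Chain-of-thought prompt: cue the model to produce
    # intermediate reasoning steps before the final answer.
    return f"Q: {question}\nA: Let's think step by step."
```

In practice the CoT variant tends to help most on multi-step problems (arithmetic, logic), where the intermediate steps give the model room to work.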

LoRA: Low-Rank Adaptation of Large Language Models

Now that LLMs often have tens of billions of parameters, a single fine-tuning run can exhaust an entire GPU. LoRA (Low-Rank Adaptation of Large Language Models) offers a clever solution: instead of modifying the model’s original parameters directly, it learns new knowledge through low-rank matrices. This allows us to adapt the model’s behavior quickly and at very low cost, while still preserving its original performance.
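The idea can be sketched without any deep learning framework: freeze the pretrained weight W and add a trainable product of two thin matrices, W + (alpha/r)·B·A, as in the LoRA paper. The class below is a hypothetical toy for a single linear layer; B starts at zero so the adapted layer initially behaves exactly like the pretrained one.

```python
import random

def matmul(A, B):
    # Naive dense matrix multiply for lists of lists.
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha/r) * B @ A."""

    def __init__(self, W, r, alpha=1.0):
        d_out, d_in = len(W), len(W[0])
        self.W = W                      # frozen pretrained weight (d_out x d_in)
        self.r, self.alpha = r, alpha
        # A gets small random values, B gets zeros, so the low-rank
        # update starts at zero and training departs from the
        # pretrained behavior gradually.
        self.A = [[random.gauss(0, 0.01) for _ in range(d_in)]
                  for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        # x is a column vector given as a list of single-element rows.
        base = matmul(self.W, x)
        lora = matmul(self.B, matmul(self.A, x))
        s = self.alpha / self.r
        return [[base[i][0] + s * lora[i][0]] for i in range(len(base))]
```

Only A and B (r·(d_in + d_out) numbers) would be trained, which is where the memory savings come from when r is much smaller than the layer dimensions.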

Generative Pre-trained Transformer, GPT

Over the past decade in the field of Natural Language Processing (NLP), the Generative Pre-trained Transformer (GPT) has undoubtedly been one of the most iconic technologies. GPT has not only redefined the approach to language modeling but also sparked a revolution centered around pre-training, leading to the rise of general-purpose language models. This article begins with an overview of the GPT architecture and delves into the design principles and technological evolution from GPT-1 to GPT-3.

Attention Models

An attention mechanism is a deep learning method that lets a model focus on the most relevant parts of its input when producing each piece of its output. Unlike traditional sequence models, which often struggle with longer inputs, attention allows a model to dynamically weight different parts of the input sequence when generating each part of the output sequence.
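As a concrete illustration, here is scaled dot-product attention for a single query, written in plain Python. The helpers `softmax` and `attention` are toy names for this sketch, not references to any particular library.

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Scores each key against the query, turns the scores into
    weights with softmax, and returns the weighted sum of values.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
```

When the query strongly matches one key, its weight approaches 1 and the output is dominated by the corresponding value, which is exactly the "focus on the relevant part" behavior described above.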

Sequence to Sequence Model (Seq2Seq)

The Sequence to Sequence (Seq2Seq) model is a neural network architecture that maps one sequence to another. It has revolutionized the field of Natural Language Processing (NLP), significantly enhancing the performance of tasks such as translation, text summarization, and chatbots. This article will dive deeply into the principles behind the Seq2Seq model.
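The encoder-decoder data flow can be sketched abstractly: the encoder folds the input into a fixed-size context vector, and the decoder emits output tokens one at a time, each step conditioned on that context and the previous token. The helpers below are toy stand-ins with no learned weights; all names are invented for this sketch.

```python
def encode(src_tokens, embedding, dim=4):
    """Toy encoder: fold the source sequence into one fixed-size
    context vector (a stand-in for the final RNN hidden state)."""
    state = [0.0] * dim
    for tok in src_tokens:
        vec = embedding[tok]
        # Toy recurrence: blend the previous state with the current input.
        state = [0.5 * s + 0.5 * v for s, v in zip(state, vec)]
    return state

def decode(context, start_token, step_fn, max_len=10, eos="<eos>"):
    """Toy decoder loop: generate tokens until an end-of-sequence
    marker, each step seeing the context and the previous token."""
    out, tok = [], start_token
    for _ in range(max_len):
        tok = step_fn(context, tok)
        if tok == eos:
            break
        out.append(tok)
    return out
```

The key structural point is the single context vector between the two halves; its fixed size is exactly the bottleneck that attention mechanisms were later introduced to relieve.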

Bi-directional Recurrent Neural Networks (BRNNs)

Bi-directional recurrent neural networks (BRNNs) are an extension of standard RNNs specifically designed to process sequential data in both forward and backward directions. Compared to traditional RNNs, BRNN architectures maintain more comprehensive context information, enabling them to capture useful dependencies across entire sequences for improved predictions in various natural language processing and speech recognition tasks.
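The bidirectional pass itself is simple to sketch: run one recurrence left-to-right, run another right-to-left over the reversed input, re-align the backward states to the original time order, and concatenate the two hidden states at each timestep. A toy Python illustration with hidden states as plain lists; `rnn`, `birnn`, and the step functions are hypothetical names.

```python
def rnn(inputs, step, init):
    # Run a recurrence over the sequence, collecting the state at each step.
    states, h = [], init
    for x in inputs:
        h = step(h, x)
        states.append(h)
    return states

def birnn(inputs, step_f, step_b, init):
    """Bidirectional pass: forward states plus backward states
    (re-aligned to the original order), concatenated per timestep."""
    fwd = rnn(inputs, step_f, init)
    bwd = list(reversed(rnn(list(reversed(inputs)), step_b, init)))
    return [f + b for f, b in zip(fwd, bwd)]  # list concat joins the vectors
```

Each output position thus sees a summary of everything before it (forward half) and everything after it (backward half), which is the "comprehensive context" advantage described above.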