Artificial Intelligence Archives - Page 2 of 6

Dynamic Programming, DP

By Wayne
12/12/2025

In Reinforcement Learning (RL), Dynamic Programming (DP) is the earliest and most complete solution framework. DP is rarely applicable directly to practical high-dimensional or continuous environments, because it needs a complete model of the environment and sweeps over every state. Even so, it reveals the mathematical foundations of the core concepts in modern RL. The convergence objectives and update rules of most RL algorithms can be traced back to the Bellman equations and the Generalized Policy Iteration (GPI) framework used in DP.
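As a rough illustration of how a Bellman backup drives a DP method, here is a minimal value-iteration sketch. The two-state MDP, the table `P`, and the names `q_value` and `value_iteration` are hypothetical, chosen only for this example; they are not taken from the post.

```python
# Hypothetical 2-state, 2-action MDP, used only for illustration.
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # discount factor (assumed value)


def q_value(V, s, a, gamma=GAMMA):
    """One-step lookahead: expected reward plus discounted next-state value."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(gamma=GAMMA, tol=1e-10):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            new_v = max(q_value(V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            break
    # Read off a greedy policy from the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}
    return V, policy
```

On this toy problem, always taking action 1 is optimal, and the values settle near 19 for state 0 and 20 for state 1. The sketch needs the full transition table `P`, which is exactly the requirement that makes DP hard to apply to large or unknown environments.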

Generalized Policy Iteration, GPI

By Wayne
10/12/2025

Generalized Policy Iteration (GPI) is not a single algorithm but the general framework underlying almost all Reinforcement Learning (RL) methods. It interleaves policy evaluation and policy improvement, enabling algorithms to steadily approach the optimal policy and the optimal value function even under limited information.
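The interplay between evaluation and improvement can be sketched on a toy problem. Everything below (the deterministic table `NEXT`, the function names, and running only a few evaluation sweeps per round) is an assumption made for illustration, not the post's own code.

```python
# Hypothetical deterministic MDP: NEXT[s][a] = (next_state, reward).
NEXT = {
    0: {0: (0, 0.0), 1: (1, 1.0)},
    1: {0: (0, 0.0), 1: (1, 2.0)},
}
GAMMA = 0.9  # discount factor (assumed value)


def evaluate(policy, V, sweeps):
    """Partial policy evaluation: a few Bellman expectation backups."""
    for _ in range(sweeps):
        for s in NEXT:
            s2, r = NEXT[s][policy[s]]
            V[s] = r + GAMMA * V[s2]
    return V


def improve(V):
    """Policy improvement: act greedily with respect to the current V."""
    def q(s, a):
        s2, r = NEXT[s][a]
        return r + GAMMA * V[s2]
    return {s: max(NEXT[s], key=lambda a: q(s, a)) for s in NEXT}


def gpi(sweeps=3, rounds=200):
    """Alternate incomplete evaluation with greedy improvement."""
    policy = {s: 0 for s in NEXT}
    V = {s: 0.0 for s in NEXT}
    for _ in range(rounds):
        V = evaluate(policy, V, sweeps)
        policy = improve(V)
    return V, policy
```

Evaluation is deliberately left unfinished in each round. The point of GPI is that the two processes still pull each other toward the same fixed point, so neither has to run to completion before the other starts.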

Markov Decision Process, MDP

By Wayne
09/12/2025

A Markov Decision Process (MDP) provides the rigorous mathematical foundation underlying all policy evaluation and policy improvement methods in Reinforcement Learning (RL). Through an MDP, we can formally describe the interaction between an agent and its environment, and define the value of a policy in terms of its expected return.
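The "return" that a policy's value is defined over can be made concrete with a short sketch. It assumes the usual discounted form, G = r_1 + gamma * r_2 + gamma^2 * r_3 + ..., and the function name `discounted_return` is hypothetical.

```python
def discounted_return(rewards, gamma):
    """Discounted sum of a finite reward sequence, accumulated back to front."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three rewards of 1 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1.75
```

The value of a state under a policy is the expectation of this quantity over the trajectories the policy generates from that state.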

Chain-of-Thought, CoT

By Wayne
20/11/2025

The performance of LLMs on reasoning tasks has improved substantially in recent years with the introduction of Chain-of-Thought (CoT) prompting. This technique guides an LLM to produce step-by-step intermediate reasoning, enabling the model to exhibit a human-like structure of thought. As task complexity increases, however, the limitations of traditional CoT have become more apparent, motivating a series of follow-up methods designed to address them. This article presents an overview of CoT and its extensions.

Attention Mechanisms

By Wayne
15/09/2025

Attention has become a central concept in modern neural networks. Popular architectures such as the GPT models and the Vision Transformer (ViT) are built on it. This article examines the key attention mechanisms that underlie these models.
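For orientation, here is a minimal sketch of scaled dot-product attention, the form used inside Transformer models such as GPT and ViT. Whether the article covers this exact variant is an assumption. The helper names and the list-of-lists layout are hypothetical and chosen for readability, not speed.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, with matrices as lists of rows."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output row is the attention-weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

With a zero query every key scores the same, so the output is the plain average of the value rows. That is a quick sanity check that the weights sum to one.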

Vision Transformer Model

By Wayne
29/06/2025

In the field of image recognition, Convolutional Neural Networks (CNNs) have long been the dominant architecture. In recent years, Transformer models have achieved great success in Natural Language Processing (NLP), which has led researchers to consider applying the Transformer architecture to image processing tasks. Vision Transformer (ViT) is a model designed for image understanding based on the Transformer framework.

Layer Normalization

By Wayne
17/06/2025

Normalization is a data transformation technique originating from statistics. It adjusts the mean and variance of data to make it more stable and predictable. In deep learning, normalization is widely used to improve the stability and efficiency of model training. This article explains the original concept of normalization, introduces the design and limitations of batch normalization, and explores how layer normalization addresses these issues to become a standard component in modern language models.
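The mean-and-variance adjustment described above can be sketched for a single feature vector. The names `gamma` and `beta` (learnable scale and shift) and the default `eps=1e-5` are assumptions for this example, not values stated in the post.

```python
import math


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one sample across its own features, then scale and shift."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # biased variance over features
    if gamma is None:
        gamma = [1.0] * n
    if beta is None:
        beta = [0.0] * n
    return [g * (v - mean) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gamma, beta)]
```

Because the statistics come from the sample itself rather than from the batch, the result does not depend on batch size. That is the property that makes this form convenient for language models.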

Adam Optimizer

By Wayne
15/06/2025

When training neural networks, choosing a good optimizer is critically important. Adam is so commonly used that it has almost become the default choice. Adam is built upon the foundations of SGD, Momentum, and RMSprop. By revisiting the evolution of these methods, we can better understand the principles behind Adam.
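How Adam combines a Momentum-style average of gradients with an RMSprop-style average of squared gradients can be sketched as one scalar update. The function name `adam_step` is hypothetical; the default hyperparameters are the commonly cited ones from the original Adam paper.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (Momentum part)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment (RMSprop part)
    m_hat = m / (1 - beta1 ** t)                # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

On the very first step the bias-corrected ratio reduces to roughly the sign of the gradient, so the parameter moves by about `lr` regardless of the gradient's magnitude.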

LoRA: Low-Rank Adaptation of Large Language Models

By Wayne
11/06/2025

LLMs often have tens of billions of parameters, so a single full fine-tuning run can exhaust the memory of an entire GPU. LoRA (Low-Rank Adaptation of Large Language Models) offers a clever solution: instead of modifying the model’s original parameters directly, it freezes them and learns the task-specific update through small low-rank matrices. This allows us to adapt the model’s behavior quickly and at very low cost, while still preserving its original performance.
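The low-rank idea can be sketched with plain lists: the frozen weight `W0` is left untouched and the update is the product of two thin matrices, scaled by `alpha / r`. The dimensions, the seed, and the helper names below are hypothetical and deliberately tiny.

```python
import random


def matmul(A, B):
    """Plain matrix product for lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def lora_weight(W0, A, B, alpha):
    """Effective weight W0 + (alpha / r) * B A, where r is the LoRA rank."""
    r = len(A)
    delta = matmul(B, A)  # shape: d_out x d_in, rank at most r
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W0, delta)]


d_out, d_in, r = 8, 8, 2           # toy sizes (assumed)
rng = random.Random(0)
W0 = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]      # trainable
B = [[0.0] * r for _ in range(d_out)]                                  # trainable, zero init

W = lora_weight(W0, A, B, alpha=4.0)
trainable = r * (d_in + d_out)     # parameters in A and B
full = d_in * d_out                # parameters in W0
```

Initializing `B` to zero means the adapted model starts out identical to the original one. Only `A` and `B` are trained, which is where the savings come from once `r` is much smaller than the layer dimensions.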

CLIP Model

By Wayne
03/06/2025

CLIP (Contrastive Language-Image Pre-training) is a model proposed by OpenAI in 2021. It achieves strong generalization capability by integrating visual and language representations, and it has extensive potential applications. This article will introduce both the theory and practical implementation of CLIP.
