Posts

Showing posts with the label ReinforceLearning

1. ReinforceLearning 101 (Bandit Problem)

Reinforcement Learning

The agent learns the optimal solution by interacting with the environment itself.

Types of Machine Learning

Supervised Learning: The answer values are labeled, which is equivalent to knowing the optimal solution in advance.

Unsupervised Learning: Mainly used to find structures or patterns in data without answer labels, for example clustering, feature extraction, and dimensionality reduction.

Reinforcement Learning: The agent, which is the subject of action, is placed in an environment, observes the state of the environment, and takes an action suitable for that state. As a result of the action, the environment changes; the agent then receives a reward and simultaneously observes a new state. In this process, the agent learns how to maximize the total sum of rewards it receives. This is a repetition of State → Action → Reward.

Reinforcement learning receives rewards as feedback from the environment, which is different in nature from answer labels in supervised learning. In supervised ...
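To make the action → reward cycle described above concrete, here is a minimal sketch of a multi-armed bandit, the problem named in the post title. Everything in it is an assumption for illustration and not taken from the post: the names `BanditEnv`, `EpsilonGreedyAgent`, and `run`, the win probabilities, and the choice of an epsilon-greedy strategy. A bandit has only a single state, so the loop reduces to choosing an action and receiving a reward.

```python
import random


class BanditEnv:
    """Hypothetical environment: each arm pays 1 with a fixed, hidden probability."""

    def __init__(self, win_probs, seed=0):
        self.win_probs = win_probs
        self.rng = random.Random(seed)

    def step(self, action):
        # The environment answers an action with a reward (1.0 or 0.0).
        return 1.0 if self.rng.random() < self.win_probs[action] else 0.0


class EpsilonGreedyAgent:
    """Hypothetical agent: explores with probability epsilon, otherwise exploits."""

    def __init__(self, n_arms, epsilon=0.1, seed=1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how often each arm was pulled
        self.values = [0.0] * n_arms  # running average reward per arm
        self.rng = random.Random(seed)

    def act(self):
        if self.rng.random() < self.epsilon:
            # Explore: pick a random arm.
            return self.rng.randrange(len(self.values))
        # Exploit: pick the arm with the highest estimated value.
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental mean: new_avg = old_avg + (reward - old_avg) / n
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(steps=2000):
    env = BanditEnv([0.2, 0.5, 0.8])  # assumed win probabilities
    agent = EpsilonGreedyAgent(n_arms=3)
    total_reward = 0.0
    for _ in range(steps):
        action = agent.act()          # agent takes an action
        reward = env.step(action)     # environment returns a reward
        agent.update(action, reward)  # agent learns from the feedback
        total_reward += reward
    return agent, total_reward


if __name__ == "__main__":
    agent, total_reward = run()
    print("pull counts:", agent.counts)
    print("estimated values:", [round(v, 2) for v in agent.values])
    print("total reward:", total_reward)
```

Note that the agent is never told which arm is best. It only sees rewards, which is the difference from the labeled answers of supervised learning mentioned above.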