Introduction To Reinforcement Learning Summary — Part 1

LZP Data Science
3 min read · Aug 28, 2022

  • Quantify P(s, a, s′): the probability of reaching new state s′ after taking action a in state s.
  • The goal of reinforcement learning is for the algorithm to learn an optimal policy and take an optimal action when presented with state s.
  • Over time, we reinforce actions that lead to good outcomes and penalise actions that lead to poor outcomes.
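
As a rough illustration of P(s, a, s′), one simple way to quantify it is to count how often each transition shows up in logged experience. This is only a minimal sketch under that assumption: the function name `estimate_transition_probs` and the toy state and action labels are hypothetical and not taken from the original material.

```python
from collections import Counter


def estimate_transition_probs(transitions):
    """Estimate P(s, a, s') as count(s, a, s') / count(s, a).

    `transitions` is an iterable of (state, action, next_state) tuples.
    """
    pair_counts = Counter()    # how often action a was taken in state s
    triple_counts = Counter()  # how often that led to next state s'
    for s, a, s_next in transitions:
        pair_counts[(s, a)] += 1
        triple_counts[(s, a, s_next)] += 1
    return {
        (s, a, s_next): n / pair_counts[(s, a)]
        for (s, a, s_next), n in triple_counts.items()
    }


# Hypothetical logged experience: (state, action, new state)
observed = [
    ("low", "heat", "high"),
    ("low", "heat", "high"),
    ("low", "heat", "low"),
    ("high", "cool", "low"),
]
probs = estimate_transition_probs(observed)
for key, p in sorted(probs.items()):
    print(key, round(p, 3))
```

For every (state, action) pair that appears in the data, the estimated probabilities over new states sum to 1.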

Optimal Policy

  • It aims to maximise the average reward over time.
  • It should account for both the outcomes and the costs of an action.
  • It is non-myopic: it considers the immediate reward as well as future scenarios.
  • Near-term impacts have a heavier weight than longer-term scenarios.
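
One common way to make near-term rewards count more than later ones is to discount future rewards. The sketch below assumes a discount factor `gamma` between 0 and 1. The value 0.9 and the reward sequences are made up for illustration, and the original material may weight rewards differently.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum rewards, weighting the reward at step t by gamma ** t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# The same reward of 10, received immediately versus two steps later.
early = discounted_return([10, 0, 0])
late = discounted_return([0, 0, 10])
print(early, late)  # the immediate reward scores higher
```

With `gamma` close to 1 the policy becomes more far-sighted, while a `gamma` close to 0 makes it nearly myopic.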

Reinforcement Learning Solution Setup

  • Discretise each continuous value in the current state into n bins.
  • Apply the same discretisation logic to each action as well.
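
The discretisation step can be sketched as follows, assuming equal-width bins over a known value range. The helper names, the ranges, and the bin counts are all hypothetical. Other binning schemes, such as quantile-based bins, would work just as well.

```python
from bisect import bisect_right


def make_bin_edges(low, high, n_bins):
    """Return the n_bins - 1 interior edges of equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    return [low + i * width for i in range(1, n_bins)]


def discretise(value, edges):
    """Map a continuous value to a bin index in 0 .. len(edges).

    Values outside the original range fall into the first or last bin.
    """
    return bisect_right(edges, value)


def discretise_vector(values, edges_per_dim):
    """Discretise each component of a state (or action) with its own edges."""
    return tuple(discretise(v, e) for v, e in zip(values, edges_per_dim))


# Hypothetical state with two continuous values, using 5 and 4 bins.
state_edges = [make_bin_edges(0.0, 10.0, 5), make_bin_edges(-1.0, 1.0, 4)]
state_bins = discretise_vector([3.7, 0.25], state_edges)

# The same logic applied to a one-dimensional continuous action with 3 bins.
action_edges = [make_bin_edges(0.0, 1.0, 3)]
action_bins = discretise_vector([0.9], action_edges)

print(state_bins, action_bins)
```

The resulting tuples of bin indices can then serve as keys into a table of transition probabilities or action values.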
