Practical_RL/week05_explore at master · kelvin34501/Practical_RL

Slides - here

Exploration and exploitation

[main] David Silver lecture on exploration and exploitation - video
Alternative lecture by J. Schulman - video
Alternative lecture by N. de Freitas (with Bayesian optimization) - video
Our lectures (Russian)
- "mathematical" lecture (by Alexander Vorobev) '17 - slides, video
- "practical" lecture '18 - video
- Seminar - video

More materials

Gittins Index - a less heuristic approach to bandit exploration - article
"Deep" version: variational information maximizing exploration - video
- Same topics in Russian - video
Lecture covering intrinsically motivated reinforcement learning - video
- Slides
- Same topics in Russian - video
- Note: UCB-1 is not limited to Bernoulli rewards; it applies to arbitrary r in [0,1], so you can scale any bounded reward to [0,1] for peace of mind. It is derived directly from Hoeffding's inequality.
Very interesting blog post written by Lilian Weng that summarises this week's materials: The Multi-Armed Bandit Problem and Its Solutions
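
To make the UCB-1 note above concrete, here is a minimal sketch of the algorithm with rewards rescaled to [0,1]. For rewards in [0,1], Hoeffding's inequality bounds the chance that the true mean exceeds the empirical mean by more than u after n pulls by exp(-2nu^2); setting that to t^-4 gives the usual bonus sqrt(2 ln t / n). The names `scale_reward`, `ucb1`, `pull`, `r_min` and `r_max` are illustrative assumptions, not part of the course code.

```python
import math


def scale_reward(r, r_min, r_max):
    """Map a reward known to lie in [r_min, r_max] onto [0, 1]."""
    return (r - r_min) / (r_max - r_min)


def ucb1(pull, n_arms, n_steps, r_min=0.0, r_max=1.0):
    """Run UCB-1 for n_steps; pull(arm) is a hypothetical reward callback."""
    counts = [0] * n_arms    # number of pulls per arm
    means = [0.0] * n_arms   # running mean of scaled rewards per arm
    for t in range(1, n_steps + 1):
        if t <= n_arms:
            arm = t - 1      # play every arm once so counts are nonzero
        else:
            # Empirical mean plus the Hoeffding-based exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        r = scale_reward(pull(arm), r_min, r_max)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # incremental mean
    return counts, means
```

Calling `ucb1(pull, n_arms=3, n_steps=2000, r_min=0.0, r_max=10.0)` with a `pull` that returns rewards in [0, 10] should concentrate most pulls on the arm with the highest mean, assuming the rewards really do stay within the stated bounds.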

Seminar

In this seminar, you'll be solving basic and contextual bandits with uncertainty-based exploration like Bayesian UCB and Thompson Sampling. You will also get acquainted with Bayesian Neural Networks.
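
As a rough illustration of uncertainty-based exploration, the sketch below shows Thompson Sampling on a basic bandit. It assumes Bernoulli rewards (0 or 1) and a uniform Beta(1, 1) prior on each arm; it is not taken from the seminar notebook, and the names `thompson_sampling` and `pull` are hypothetical.

```python
import random


def thompson_sampling(pull, n_arms, n_steps, rng=None):
    """Beta-Bernoulli Thompson Sampling; pull(arm) must return 0 or 1."""
    rng = rng or random.Random()
    alpha = [1.0] * n_arms  # 1 + observed successes (Beta(1, 1) prior)
    beta = [1.0] * n_arms   # 1 + observed failures
    counts = [0] * n_arms
    for _ in range(n_steps):
        # Draw one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=samples.__getitem__)
        r = pull(arm)
        # Conjugate posterior update for a Bernoulli likelihood.
        alpha[arm] += r
        beta[arm] += 1 - r
        counts[arm] += 1
    return counts, alpha, beta
```

Arms with wide posteriors occasionally produce high samples and get tried, while arms that are confidently poor are rarely chosen. Bayesian UCB uses the same posteriors but picks the arm with the highest upper quantile instead of sampling.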

Everything else is in the notebook :)
