Course

Deep Reinforcement Learning


This advanced course starts with a quick review of some deep learning architectures followed by an introduction to fundamental concepts of reinforcement learning (RL) that we illustrate with concrete examples. Next, we’ll explore the Bellman equation, policies, models, Q-learning, the SARSA algorithm, and temporal difference (TD) learning.
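As a rough illustration of the temporal difference ideas named above, the following minimal sketch contrasts the tabular Q-learning and SARSA updates. It is not taken from the course materials: the function names, the NumPy table layout (`Q[state, action]`), and the default values of `alpha` and `gamma` are all assumptions chosen for the example.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Off-policy TD update: bootstrap from the best action in s_next.

    Q is a 2-D array indexed as Q[state, action] (an assumed layout).
    """
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy TD update: bootstrap from the action actually taken next."""
    td_target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```

The only difference between the two is the bootstrap term: Q-learning uses the maximum value over next actions, while SARSA uses the value of the next action the agent actually selects.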

In this deep reinforcement learning (DRL) course, you will learn how to solve common tasks in RL, including some well-known simulations, such as CartPole, MountainCar, and FrozenLake. You will be introduced to concepts such as clipping regions and policy gradients, as well as an extensive collection of algorithms, including DQN, prioritized experience replay, DDQN, D4PG, A2C, PPO, TRPO, DDPG, and SAC.

Later, the course introduces additional algorithms, such as ACER and ACKTR, as well as DRL libraries, such as Google Dopamine and TensorFlow Agents (TF-Agents). Almost all of the code samples are written in TensorFlow 2 (tf.keras), along with a limited number of code samples in PyTorch. The development of a wide range of DRL algorithms has improved results in diverse areas, such as natural language processing and robotics. In addition, DRL-based systems represent the state of the art in Go as well as in highly sophisticated multi-player games (including StarCraft and Dota).

Topics Include:

  • Deep learning architectures
  • Markov decision processes
  • Reinforcement and deep reinforcement learning
  • Policy gradients and various algorithms
  • Proximal policy optimization
  • Various actor/critic algorithms
  • Deep RL libraries

Learning Outcomes:

At the conclusion of the course, the student should be able to:

  • Describe how a bi-LSTM differs from a standard LSTM
  • Explain how n-grams work
  • Describe the BERT architecture
  • Describe Q-learning, models, and policies
  • Define the purpose of the Bellman equation
  • Discuss the advantages/disadvantages of reinforcement learning
  • Explain how the epsilon-greedy algorithm differs from a pure greedy algorithm
  • Discuss how deep learning enhances reinforcement learning
  • Describe GANs and how they pertain to autonomous vehicles
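
As a small illustration of the epsilon-greedy outcome listed above, the sketch below compares pure greedy action selection with epsilon-greedy selection. It is an assumed example rather than course code: the function names and the use of a NumPy random generator passed in as `rng` are choices made for illustration.

```python
import numpy as np

def greedy_action(q_values):
    """Always pick the action with the highest estimated value."""
    return int(np.argmax(q_values))

def epsilon_greedy_action(q_values, epsilon, rng):
    """With probability epsilon explore uniformly at random; otherwise exploit."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return greedy_action(q_values)

# Example usage with hypothetical value estimates for three actions.
rng = np.random.default_rng(0)
q_values = np.array([0.1, 0.9, 0.3])
actions = [epsilon_greedy_action(q_values, epsilon=0.1, rng=rng) for _ in range(10)]
```

A pure greedy agent never revisits actions whose early estimates happen to be low, whereas the epsilon-greedy agent keeps sampling them occasionally, which lets their estimates improve.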


Prerequisites: Please note that this course covers advanced topics, and students are expected to have completed one of the prerequisite courses or have equivalent experience.

Sections Open for Enrollment:

Open Sections and Schedule
Start / End Date            Units   Location   Cost    Instructor
05-07-2020 to 07-09-2020    None    ONLINE     $1020   Oswald A Campesato


Schedule

Date              Start Time   End Time    Meeting Type   Location
Thu, 05-07-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 05-14-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 05-21-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 05-28-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 06-04-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 06-11-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 06-18-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 06-25-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 07-02-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE
Thu, 07-09-2020   6:30 p.m.    9:30 p.m.   Live-Online    ONLINE

Course Inquiry

Ask us any questions you may have about this course.

Contact Us
Speak to a student services representative.

Call (408) 861-3860

Email extension@ucsc.edu