Pablo Rodriguez

State Of Reinforcement

“Reinforcement learning is an exciting set of technologies. In fact, reinforcement learning was the subject of my PhD thesis, so I was, and still am, excited about these ideas.”

“Despite all the research momentum and excitement behind reinforcement learning, though, I think there is a bit, or maybe sometimes a lot, of hype around it. So what I hope to do is share with you a practical sense of where reinforcement learning is today in terms of its utility for applications.”

“One of the reasons for some of the hype about reinforcement learning is that, it turns out, many of the research publications have been on simulated environments.”

Personal Experience: “Having worked in both simulations and on real robots myself, I can tell you that it’s much easier to get a reinforcement learning algorithm to work in a simulation or in a video game than in a real robot.”

Common Developer Feedback: “A lot of developers have commented that even after they got it to work in simulation, it turned out to be surprisingly challenging to get something to work in the real world or the real robot.”

“Despite all the media coverage about reinforcement learning, today there are far fewer applications of reinforcement learning than supervised and unsupervised learning.”

Probability Assessment: “If you are building a practical application, the odds that you will find supervised or unsupervised learning useful, or the right tool for the job, are much higher than the odds that you would end up using reinforcement learning.”

Comparative Experience: “I have used reinforcement learning a few times myself especially on robotic control applications, but in my day to day applied work, I end up using supervised and unsupervised learning much more.”

“There is a lot of exciting research in reinforcement learning right now, and I think the potential of reinforcement learning for future applications is very large.”

“Reinforcement learning still remains one of the major pillars of machine learning.”

Framework Benefits: “Having it as a framework as you develop your own machine learning algorithms will, I hope, make you more effective at building working machine learning systems as well.”

Appropriate Applications:

  • Robotic control tasks
  • Game playing scenarios
  • Sequential decision-making problems
  • Applications where reward functions are easier to specify than direct policies
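The last point is worth making concrete. In reinforcement learning you typically write down a reward function (what counts as success) rather than the policy itself (what action to take in every state), and the algorithm derives the policy. Here is a minimal sketch of that idea using tabular Q-learning on a hypothetical toy example (a 5-state corridor, not from the course): the only thing we specify by hand is a +1 reward for reaching the goal state, and the learned Q-values recover the "always move right" policy on their own.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal
ACTIONS = [-1, +1]    # move left, move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

def step(state, action):
    """Environment dynamics plus the hand-specified reward function."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection (random tie-breaking)
            if rng.random() < EPS:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda a_: (q[(s, a_)], rng.random()))
            s2, r, done = step(s, a)
            best_next = max(q[(s2, a_)] for a_ in ACTIONS)
            # standard Q-learning update
            q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
            s = s2
    return q

q = train()
# The policy is *derived* greedily from Q, not specified by hand:
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

Writing the equivalent policy directly would be trivial here, but for problems like the lunar lander or robotic control, specifying "how well did we do" is far easier than specifying "what to do in every state" — which is exactly when reinforcement learning is the right tool.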

Better Alternatives Available:

  • Static prediction problems → Supervised learning
  • Pattern discovery → Unsupervised learning
  • Most business applications → Traditional ML approaches

“So I hope you’ve enjoyed this week’s materials on reinforcement learning, and specifically I hope you have fun getting the lunar lander to land for yourself.”

Personal Accomplishment: “I hope it will be a satisfying experience when you implement an algorithm and then see that lunar lander land safely on the moon because of code that you wrote.”

Current State

  • Fewer real-world applications than supervised/unsupervised learning
  • Simulation-to-reality gap remains challenging
  • More research publications than production deployments

Future Potential

  • Large potential for future applications
  • Active research community
  • Fundamental pillar of machine learning

Key Challenges

  1. Hype vs Reality: Significant gap between media coverage and practical applications
  2. Implementation Challenges: Real-world deployment more difficult than simulation success
  3. Application Frequency: Much less common than supervised/unsupervised learning

Why It Still Matters

  1. Foundational Knowledge: Important for the ML practitioner's toolkit
  2. Future Opportunities: Potential for significant impact as the field matures
  3. Specific Niches: Valuable for particular problem types (robotics, control, games)

Understanding reinforcement learning provides valuable perspective on the complete machine learning landscape, even if it’s not the primary tool for most applications. The conceptual framework enhances overall machine learning system design capabilities.