Pablo Rodriguez

Continuous State Space

Example of Continuous State Space Applications

The simplified Mars rover example used “a discrete set of states” where the rover “could only be in one of six possible positions.” However, most real robots can be in “any of a very large number of continuous value positions.”

Instead of discrete positions 1-6, a Mars rover could be positioned anywhere on a line from “0-6 kilometers where any number in between is valid.” Examples include positions like:

  • 2.7 kilometers along
  • 4.8 kilometers
  • Any other number between zero and six
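The contrast between the two rover models can be sketched in a few lines of Python. This is only an illustration: the function names and the `TRACK_LENGTH_KM` constant are hypothetical, chosen here to mirror the 1-6 positions and the 0-6 kilometer line described above.

```python
import random

# Simplified rover: the state is one of six labelled positions.
DISCRETE_STATES = [1, 2, 3, 4, 5, 6]

# Continuous rover: the state is any real number on a 0-6 km line.
# (Constant name is an assumption made for this sketch.)
TRACK_LENGTH_KM = 6.0


def is_valid_discrete(state) -> bool:
    """True only for the six enumerated positions."""
    return state in DISCRETE_STATES


def is_valid_continuous(position_km: float) -> bool:
    """True for any position between 0 and 6 km, inclusive."""
    return 0.0 <= position_km <= TRACK_LENGTH_KM


def sample_continuous_position(rng: random.Random) -> float:
    """Draw a random valid position; almost never one of the six integers."""
    return rng.uniform(0.0, TRACK_LENGTH_KM)


print(is_valid_discrete(2.7))    # 2.7 is not one of the six positions
print(is_valid_continuous(2.7))  # but it is a valid continuous position
```

Because a continuous state can take infinitely many values, it cannot be enumerated the way the six discrete positions can.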

For controlling a self-driving car or truck smoothly, the state includes six numbers:

Position and Orientation:

  • x position: Location coordinate
  • y position: Location coordinate
  • θ (theta): Orientation angle (which way it’s facing)

Velocity Information:

  • ẋ (x dot): Speed in x-direction (“how quickly is this x-coordinate changing”)
  • ẏ (y dot): Speed in y-direction (“how quickly is the y coordinate changing”)
  • θ̇ (theta dot): Angular velocity (“how quickly is the angle of the car changing”)

Unlike the Mars rover with discrete states 1-6, the car state “comprises this vector of six numbers, and any of these numbers can take on any value within its valid range.” For example, “Theta should range between zero and 360 degrees.”
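One way to picture this six-number state is as a small record that can be flattened into a vector. The sketch below is an assumption-laden illustration, not a definitive representation: the `CarState` class, its field names, and the choice to wrap θ into the range [0, 360) degrees are all hypothetical, and the units of the other fields are left unspecified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CarState:
    """Hypothetical container for the six-number car state."""
    x: float          # position coordinate
    y: float          # position coordinate
    theta_deg: float  # orientation angle, in degrees
    x_dot: float      # how quickly x is changing
    y_dot: float      # how quickly y is changing
    theta_dot: float  # how quickly the angle is changing

    def __post_init__(self) -> None:
        # Keep the orientation inside its valid range of 0 to 360 degrees.
        self.theta_deg = self.theta_deg % 360.0

    def as_vector(self) -> List[float]:
        """Flatten the state into the vector of six numbers."""
        return [self.x, self.y, self.theta_deg,
                self.x_dot, self.y_dot, self.theta_dot]


state = CarState(x=12.5, y=-3.0, theta_deg=450.0,
                 x_dot=1.2, y_dot=0.0, theta_dot=-5.0)
print(state.as_vector())
```

Each entry of the vector is a real number, so two states can differ by an arbitrarily small amount in any of the six components.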

Controlling an autonomous helicopter requires an even more sophisticated state representation with twelve numbers:

Position (3D):

  • x: North-south direction
  • y: East-west direction
  • z: Height above ground

Orientation (3 angles):

  • φ (Phi): Roll (“rolling to the left or the right”)
  • θ (Theta): Pitch (“pitching forward or pitching up, pitching back”)
  • ω (Omega): Yaw (“compass orientation… facing north or east or south or west”)

Linear Velocities (3D):

  • ẋ: Speed in north-south direction
  • ẏ: Speed in east-west direction
  • ż: Speed in vertical direction

Angular Velocities (3D):

  • φ̇: Rate of roll change
  • θ̇: Rate of pitch change
  • ω̇: Rate of yaw change (“how fast is its yaw changing”)

“This is actually the state used to control autonomous helicopters. It is this list of 12 numbers that is input to a policy, and the job of a policy is to look at these 12 numbers and decide what’s an appropriate action to take in the helicopter.”
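The idea of a policy as a function from the 12-number state to an action can be sketched as follows. Everything here is a placeholder: the source does not say what the helicopter's actions are or how its policy is computed, so the `linear_policy` function, the two-component action, and the weights below are assumptions made purely to show the shape of the mapping.

```python
from typing import List, Sequence

# The twelve state components listed above, in a fixed order.
STATE_NAMES = (
    "x", "y", "z",                        # position
    "phi", "theta", "omega",              # roll, pitch, yaw
    "x_dot", "y_dot", "z_dot",            # linear velocities
    "phi_dot", "theta_dot", "omega_dot",  # angular velocities
)


def linear_policy(state: Sequence[float],
                  weights: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical policy: each action component is a weighted sum of the state.

    A real helicopter controller would be far more involved; this only
    illustrates "12 numbers in, an action out".
    """
    if len(state) != len(STATE_NAMES):
        raise ValueError("expected a state with 12 numbers")
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


# Example: hovering 10 units above the ground with everything else at zero.
state = [0.0] * 12
state[STATE_NAMES.index("z")] = 10.0

# Two made-up action components; only the first reacts to height.
weights = [[0.0] * 12, [0.0] * 12]
weights[0][STATE_NAMES.index("z")] = -0.5

action = linear_policy(state, weights)
print(action)
```

In practice the policy would be learned rather than hand-written, but its interface stays the same: it reads the full state vector and returns an action.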

Continuous State Markov Decision Process: “The state of the problem isn’t just one of a small number of possible discrete values, like a number from 1-6. Instead, it’s a vector of numbers, any of which could take any of a large number of values.”

These continuous state spaces are essential for:

  • Robotics control
  • Autonomous vehicle navigation
  • Flight control systems
  • Any application requiring smooth, precise control

“In the practice lab for this week, you get to implement for yourself a reinforcement learning algorithm applied to a simulated lunar lander application. Landing something on the moon in simulation.”

The transition from discrete to continuous state spaces represents a significant increase in complexity but enables reinforcement learning to tackle real-world control problems where precise, smooth actions are required rather than simple discrete choices.