The simplified Mars rover example used “a discrete set of states” where the rover “could only be in one of six possible positions.” However, most real robots can be in “any of a very large number of continuous value positions.”
Instead of discrete positions 1-6, a Mars rover could be positioned anywhere on a line from "0-6 kilometers where any number in between is valid." For example, the rover could be at position 2.7 km or 4.8 km, since any real number in the range is a valid state.
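The distinction can be sketched in a few lines of Python (the position values here are illustrative, not from the lecture):

```python
# Discrete Mars rover: the state is one of six possible positions.
DISCRETE_STATES = {1, 2, 3, 4, 5, 6}

def is_valid_discrete(state):
    return state in DISCRETE_STATES

# Continuous Mars rover: the state is any real number on the 0-6 km line.
def is_valid_continuous(position_km):
    return 0.0 <= position_km <= 6.0

print(is_valid_discrete(2.7))    # False: 2.7 is not one of the six states
print(is_valid_continuous(2.7))  # True: any number in [0, 6] is a valid state
```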
For controlling a self-driving car or truck smoothly, the state includes six numbers:

Position and Orientation:
- x, y coordinates
- Orientation angle θ

Velocity Information:
- Velocities ẋ, ẏ in the x and y directions
- Rate of turning θ̇ (how quickly the angle θ is changing)
Unlike the Mars rover with discrete states 1-6, the car state “comprises this vector of six numbers, and any of these numbers can take on any value within its valid range.” For example, “Theta should range between zero and 360 degrees.”
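A minimal sketch of that six-number car state. The lecture only pins down the range of θ; the field names and example values below are assumptions chosen for illustration:

```python
from dataclasses import dataclass

@dataclass
class CarState:
    """Six-number continuous state for a car: position, heading, and their rates."""
    x: float          # position along one axis
    y: float          # position along the other axis
    theta: float      # orientation angle in degrees, valid range [0, 360)
    x_dot: float      # velocity in the x direction
    y_dot: float      # velocity in the y direction
    theta_dot: float  # rate of turning (degrees per second)

    def as_vector(self):
        # The policy sees the state as this vector of six numbers.
        return [self.x, self.y, self.theta, self.x_dot, self.y_dot, self.theta_dot]

s = CarState(x=10.0, y=5.0, theta=30.0, x_dot=2.0, y_dot=1.0, theta_dot=0.5)
print(len(s.as_vector()))  # 6
```

Any of the six numbers can take on any value within its valid range, which is exactly what makes this a continuous state.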
Controlling an autonomous helicopter requires an even more sophisticated state representation with twelve numbers:
Position (3D):
- x, y, z coordinates

Orientation (3 angles):
- Roll, pitch, and yaw

Linear Velocities (3D):
- Rates of change ẋ, ẏ, ż of the position

Angular Velocities (3D):
- Rates of change of roll, pitch, and yaw
"This is actually the state used to control autonomous helicopters. It's this list of 12 numbers that is input to a policy, and the job of the policy is to look at these 12 numbers and decide what's an appropriate action to take in the helicopter."
Continuous State Markov Decision Process: “The state of the problem isn’t just one of a small number of possible discrete values, like a number from 1-6. Instead, it’s a vector of numbers, any of which could take any of a large number of values.”
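One way to picture a policy over such a state is a function from the 12-number vector to an action. The linear weights below are purely hypothetical; real policies (e.g., neural networks) are far more expressive, and this only illustrates the shape of the mapping:

```python
import random

STATE_DIM = 12  # x, y, z, roll, pitch, yaw, and their six rates of change

def make_toy_policy(seed=0):
    """Return a toy linear policy: action = weighted sum of the 12 state numbers.
    Purely illustrative -- it shows that the policy's input is a vector of
    12 continuous values, nothing more."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(STATE_DIM)]

    def policy(state):
        assert len(state) == STATE_DIM
        return sum(w * s for w, s in zip(weights, state))

    return policy

pi = make_toy_policy()
hover_state = [0.0] * STATE_DIM  # hovering at the origin, all rates zero
print(pi(hover_state))           # 0.0 for the all-zero state
```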
These continuous state spaces are essential for real-world control applications like self-driving cars, trucks, and autonomous helicopters, where the robot can occupy any of an essentially unlimited number of configurations.
"In the practice lab for this week, you get to implement for yourself a reinforcement learning algorithm applied to a simulated lunar lander application. Landing something on the moon is in simulation."
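For context, the simulated lunar lander commonly used in RL toolkits (e.g., Gymnasium's LunarLander environment) also has a continuous state: an 8-number vector. The sketch below hard-codes that layout rather than importing the environment, so treat the field names as a description of the state, not the library's API:

```python
# Typical lunar lander state layout: position, velocity, angle, angular
# velocity, plus two flags for leg-ground contact.
FIELDS = ["x", "y", "x_dot", "y_dot", "theta", "theta_dot",
          "left_leg_contact", "right_leg_contact"]

def describe(state):
    """Pair each of the 8 state numbers with its meaning."""
    assert len(state) == len(FIELDS)
    return dict(zip(FIELDS, state))

hover = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # illustrative values
print(describe(hover)["y"])  # 1.0: one unit above the landing pad
```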
The transition from discrete to continuous state spaces represents a significant increase in complexity but enables reinforcement learning to tackle real-world control problems where precise, smooth actions are required rather than simple discrete choices.