Self-Driving Cars Project Part 1

Colin J. Chu
Nov 16, 2020


Welcome to the first of three posts about a Self-Driving Car Project I have had the pleasure of working on for the past few weeks.

We began by learning about OpenAI Gym, a toolkit of simulated environments for developing and testing learning algorithms; ours simulates a car driving around a track. More specifically, a Gym environment is composed of states and observations (what the simulation looks like at each moment), actions (the actions the driver can take), and rewards (feedback on the results and effects of those actions).

Here’s what the OpenAI Gym simulator looks like:

The objective of this project is to use machine learning to model a real-world driver as a policy trained with reinforcement learning (RL). RL is a family of algorithms that learn by trial and error: an agent takes an action, observes the result, and uses the reward to improve its model. In further detail, the RL cycle is made up of 3 steps: 1) the agent takes an action, which changes the environment; 2) the changed environment produces a reward (feedback) and a new state (situation); and 3) the reward and state update the agent. The cycle is repeated over and over. In professional and advanced RL systems, this loop can run continuously for millions of iterations!
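To make that cycle concrete, here is a minimal sketch of the loop in code. (The environment name CarRacing-v0 and the classic Gym API are my assumptions here; the project setup may differ.)

```python
import gym

# Assumed environment: Gym's CarRacing-v0 (the post doesn't name it explicitly).
env = gym.make("CarRacing-v0")

state = env.reset()  # initial state of the simulation
for _ in range(1000):
    action = env.action_space.sample()  # placeholder agent: random action
    # 1) the action changes the environment;
    # 2) the environment returns a reward and a new state;
    # 3) a real agent would use both to improve its policy.
    state, reward, done, info = env.step(action)
    if done:                            # episode over: start a new one
        state = env.reset()
env.close()
```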

Some examples of everyday programs that use RL include AI player-vs-computer games (the computer learns the strategies of the game), traffic light control (the system learns common traffic patterns), online advertising (websites show ads likely to interest you), and AI assistants like Alexa (which learn from your common questions and conversations).

After learning about RL, we then translated the process to code. Our controller takes three parameters: direction (steering), gas, and brake, respectively. For example, (-1, 1, 0) would tell the car to steer fully to the left at full throttle. It should be noted that the gas and brake cannot both be 1 at the same given time.
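As a hedged sketch of that action format (representing actions as NumPy arrays is my assumption):

```python
import numpy as np

# Each action is (direction, gas, brake).
full_left_full_gas = np.array([-1.0, 1.0, 0.0])  # steer fully left at full throttle
straight_coast     = np.array([ 0.0, 0.0, 0.0])  # no steering, no gas, no brake
hard_brake         = np.array([ 0.0, 0.0, 1.0])  # full brake
# Gas and brake should never both be 1.0 at the same time.
```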

The second part of our code evaluates the state and reward at every step. To do this, we worked with three more pieces: the observation (what our AI driver can see), the environment (the entire simulation), and the timestep (at every timestep, our AI driver can choose a different action).
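Putting these pieces together, one action chosen per timestep might look like the sketch below. The constant-gas policy is just a placeholder for illustration, not our actual model, and the environment name is the same assumption as above.

```python
import gym
import numpy as np

env = gym.make("CarRacing-v0")  # assumed environment

def policy(observation):
    """Placeholder driver: ignore the observation, accelerate gently, drive straight."""
    return np.array([0.0, 0.1, 0.0])

observation = env.reset()
total_reward = 0.0
for timestep in range(1000):
    action = policy(observation)                  # one decision per timestep
    observation, reward, done, info = env.step(action)
    total_reward += reward                        # track how well we're driving
    if done:
        break
env.close()
print("episode reward:", total_reward)
```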

Lastly, with these pieces in place, we were able to develop our RL model to identify color (brown = road; green = off-road) and to control acceleration (an increase in gas).
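For illustration, a color check like the one described might look like this. The patch location and the green-dominance test are my assumptions; real thresholds depend on the simulator's graphics.

```python
import numpy as np

def looks_off_road(observation, row=70, col=46, size=4):
    """Sample a small patch near the car and test whether green dominates."""
    patch = observation[row:row + size, col:col + size].astype(float)
    r, g, b = patch[..., 0].mean(), patch[..., 1].mean(), patch[..., 2].mean()
    return g > r and g > b  # green dominates -> likely grass (off-road)

def accelerate(action, amount=0.1):
    """Increase the gas component of an action, capped at full throttle."""
    action = action.copy()
    action[1] = min(action[1] + amount, 1.0)
    return action
```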

In summary, this first blog post discussed reinforcement learning and how we began coding a self-driving car. In the second blog post, we will cover behavioral cloning, which uses neural networks to learn policies.

Colin Chu is a Student Ambassador in the Inspirit AI Student Ambassadors Program. Inspirit AI is a pre-collegiate enrichment program that exposes students globally to AI through live online classes. Learn more at https://www.inspiritai.com/.
