Computer controlled model carThe
Department of Electrical & Computer systems Engineering,
and the
School
of Computer Science & Software Engineering
|
|
Supervisors:
Lloyd Allison,
Ronald Pose.
Second reader: Andrew Russell.
In this report a vision based navigation system allowing a computer to drive a model car around an arbitrary track was designed. A small video camera and transmitter were mounted on a toy radio controlled car, and processing was performed on a 100Mhz Silicon Graphics Indy.
Recognition of the track was accomplished in 2.8milliseconds by analysing only 0.17% of each video frame. It was found that by generating two moves (instead of one) each time the track was identified, the car sped up from 0.21m/s to 0.26m/s, with increased accuracy. Further improvements were obtained by concurrently driving the car whilst identifying the track, resulting in the fastest measured speed of 0.4m/s.
I thank Lloyd Allison, Ronald Pose and Andrew Russell for their invaluable guidance and suggestions throughout the year.
I also thank Jamie Scuglia, Annie Sio, Gary Evans, David Duke and Milton Richardson for the technical support and miscellaneous equipment provided.
A special thank you goes out to my family and friends for the support I received during this project.
1. Project description and motivations
*2. The hardware platform
*2.1 Operating system considerations *
2.2 Video hardware choices *
2.3 Construction of the robotic car *
3. Research and development of the vision based navigation system
*3.1 Navigational system structures used in similar projects *
3.2 The navigation system structure developed in this project *
3.3 Design of the track recognition sub-system *
3.4 Design of the low level controller sub-system *
4. Navigation system specifications and solution methodologies
*4.1 System communication *
4.2 Car metrics *
4.3 Reducing the track identification latency *
4.4 Velocity measurement *
4.5 Improving path planning *
4.6 High level control logic *
4.7 Low level controllers *
5. Experimental results and discussion
*5.1 Track recognition processing times *
5.2 Velocity measurement results *
5.3 The effect of improved path planning *
5.4 Effectiveness of the low level controllers *
6. Future research recommendations
*7. Conclusions
*8. References
*9. Appendices
*Appendix 1 – Screen captures *
Appendix 2 – Intensity plots *
Appendix 3 – Scan line averaging and edge detect plots *
Appendix 4 – Open loop 12 position simple path planning controller and error plot *
Appendix 5 – Open loop 12 position controller and error plot *
Appendix 6 – Open loop 200 position controller and error plot *
Appendix 7 – Open loop 200 position controller, multithreaded execution and error plot *
Appendix 8 – Closed loop 12 position controller and error plot *
Appendix 9 – Closed loop 200 position controller and error plot *
Appendix 10 – Closed loop 200 position controller, multithreaded execution and error plot *
Appendix 11 – Makefile and program C code 49
1. Project description and motivations
To enable a computer to autonomously drive a radio controlled model car around a small track containing an unknown number of corners, each having a previously unknown radius, is an interesting and challenging task.
In order to realise the above goal it was necessary to consider areas as diverse as: RF transmission and reception; providing small remote power sources; identifying operating system constraints; choosing a programming language; performing real time image processing; implementing real time control; and designing a logical decision centre. These areas then had to be integrated (along with accurate craftsmanship) in order to develop the navigation system controlling the robotic car.
Using a $400 budget a suitable miniature digital video camera and transmitter had to be purchased and mounted on the car. A colour SVHS quality video camera was chosen, and transmission was implemented with a Dick Smith video sender unit. The video equipment had to be powered by a 12V regulated DC source, which comprised of regulating 12 AA NiCd rechargeable cells. A VCR tuned to UHF channel 30 was used as the receiver. A 100Mhz Silicon Graphics Indy was used in this project, and was chosen for its existing video hardware and software libraries.
The aims of this project are to identify the track quickly in order to maximise both the speed and accuracy of the car, and to eventually allow for moving objects such as pedestrians and other cars.
In order to navigate the car successfully it is first necessary to perceive where the track is in relation to the car, then to generate an appropriate sequence of moves at the correct time, to place the car in the centre of the track and parallel to it. This process is continually repeated allowing the car to complete successive laps. An inherent challenge in using this method is that by the time the computer has identified where the track is the moving car has already changed position, and the sequence of moves generated will be out of date by this amount of time (referred to in this report as track identification latency). This imposes an upper limit to controllable speed, hence it is crucial to identify the track as quickly as possible. A further reduction in this latency investigated in this report involves splitting the navigation system into two separate processes. The first process identifies the track, while the other process drives the car. The track identification latency then depends on the smaller amount of time between the driving process finishing its cycle, and the execution stage of the track identification process.
Research was also conducted into only using visual cues for measuring vehicle velocity instead of using hardware sensors, in order to investigate the plausibility of using only the video stream for input to the navigation system.
This chapter describes the design choices made for building the car, and the reasons why a particular operating system was chosen to implement the navigation system.
It was decided to transmit a video stream from a camera mounted on the model car to a base computer controlling the car, instead of directly mounting the computer and camera on the car itself. This choice was made based on developmental times for each configuration.
2.1 Operating system considerations
Operating systems vary in the way they capture video data. This includes the physical hardware connectors for composite video/TV antenna input, and the low level operating system control over the incoming video data stream.
The main reasons for choosing a particular operating system are programmability, speed, and cost.
Silicon Graphics based implementation of video
Silicon Graphics Indy machines have a built in PAL composite video input that can be accessed using standard VLTM library functions. For this reason additional hardware does not need to be designed in order to provide real time video data to programmers.
PC Based implementation of video
By use of a frame grabber card and provided frame manipulation software it is possible to access high resolution real-time video data. Unfortunately these hardware/software packages retail for above $1500 (Total Turnkey Solutions 1999), making these cards too costly for this project.
A less expensive TV tuning card and supplied software is capable of tuning into the common television frequency bandwidths, however the provided proprietary software only displays video on the screen or allows capture of the data stream to an AVI or MPEG file on the hard-drive. In order for the video stream to be immediately available to the programmer, further technical information would be required. None of the vendors contacted would supply this information.
Utilising the enhanced serial port called 'USB' (a 12Mb/s token bus) video data may be read directly from the port. However this method suffers from the above problem that the required technical information is not available.
In order to use a PC cheaply for this project a frame grabber card would have to be designed and constructed so that the hardware/software communication modes are known. For this reason a 100MHz Silicon Graphics Indy running Unix was chosen over a faster PC.
In choosing a transmitter/receiver pair the main considerations are signal range and directional constraints, size, power consumption and cost.
Video Transmitter/receiver choice
Radio/UHF frequencies (MHz) are lower on the E-M spectrum compared to microwave frequencies (GHz). For this reason RF/UHF transmission is less directional than microwave transmission which usually is quoted as a 'line of sight' wireless connection.
The microwave video system that Jaycar Electronics (1999) distribute is named "Third Eye" ™ and costs $599 for each camera/transmitter/receiver package and has a quoted range of "150m line of sight". Sony have a more expensive system costing $550 for each transmitter/receiver (without a camera) having a quoted range of "400m line of sight". Unfortunately all commercially available microwave receivers lose their incoming signal when the angle between them and the transmitter increases beyond 30 degrees. This means that for a stable signal from the RC car driving on a track with a 5 metre radius the receiver must be positioned in the centre of the track elevated no less than say 5/tan(30) = 9 metres directly up, which is too rigid a constraint for this project.
Commercially available IR (infra red) transmitters/receivers could be used for this project however the additional hardware of a fast A/D converter connected between the video camera and the transmitter, and a similar interface between the IR receiver and the computer would have to be designed and constructed. For this reason an IR system was not used.
The most powerful unlicensed commercially available RF/UHF video transmitters operate with a non-directional range of up to 10 metres. This means that provided the transmitter and receiver are within 10 metres of each other anywhere in three dimensional space a clear picture will be received. The ARISTA AVS 30 was chosen based on size and power consumption (12VDC 300mA). The unit transmits nominally on UHF channel 30 but can be adjusted so that more than one unit can be operated simultaneously. The receiver will be a standard VCR tuned to the correct channel. The AVS 30 unit costs $72.
Video camera choice
In choosing a video camera the primary concerns are size, viewing angle, minimum light operation, and cost.
A viewing angle greater than 100o would be ideal for this project, however video cameras capable of providing these wide views are physically too large to be mounted on the RC car. It was then necessary to choose a miniature video camera having the widest viewing angle available.
Allthings Sales and Service (1999) priced their DSP CAM-COL 554 camera at $50 less than there closest rivals (Jaycar Electronics QC-3483, 1999), with superior functionality and specifications as outlined below in table 1.
Table 1:
video camera specifications|
CAM-COL 554 |
QC-3483 |
|
|
Horizontal resolution |
400+ |
330 |
|
Minimum illumination |
2.5 Lux |
5 Lux |
|
SNR |
45+dB |
45dB |
|
Maximum viewing angle (edge to edge of the screen) |
75o |
70o |
|
Price |
$199 |
$249 |
The CAM-COL 554 also offered the option of interchangeable lenses ranging from 2.1mm (75 degree viewing angle) to 16mm (12 degree viewing angle), the QC-3483 did not have this feature. The CAM-COL 554 also offered "MicroFine" focusing for extra fine rotation of the focusing mechanism, automatic gain control, and back-light compensation, all of which were not available on the QC 3483.
Miniature digital video cameras of comparable specifications and features may be purchased overseas, however additional freight and insurance costs must be added.
Lens choice
The CAM-COL 554 comes with a standard 3.6mm F2.0 board lens, giving an effective viewing angle of 48 degrees. Additional board lenses include 2.1mm, 2.4mm, 2.9mm, 3.2mm, 4mm, 4.3mm, 5.7mm, 8mm, 12mm and 16mm. Choosing the widest viewing angle lens (2.1mm F2.5) gives a viewing angle of 75 degrees. Since the advertised horizontal resolution is 400 pixels, at a distance of one metre the 3.6mm lens will view a horizontal width of 2*tan(48/2) = 89cm. A 4cm horizontal width (ie masking tape) at one metre will thus occupy 400*(4/89) = 17 pixels. Similarly a 4cm wide piece of masking tape positioned at a distance of one metre using the 75o lens will occupy 400*(4/100*2*tan(75/2)) = 10 pixels, which will be adequate to perform pattern recognition on.
2.3 Construction of the robotic car
The platform holding the camera, transmitter and power system is elevated by 13° in order to increase the viewing area of the track, without being elevated so much as to cause instability of the car around tight corners. The platform is attached to the car using 3mm threaded rod and lock nuts, which allow the platform angle to be adjustable.
Initially the camera was mounted on a sponge base secured by four springs in order to provide damping from motor vibration. As the project developed and the speed of the car increased, it was found that the camera moved by up to 5° relative to the platform, which adversely effected control of the car. The sponge and springs were then replaced by a 4mm thick rubber sheet which overcome this problem and still provided vibration insulation.
The camera is positioned 23cm from the front bumper of the car which measures 39cm bumper to bumper. The distance between the two axles of the car is 23cm, while the distance between the outer edges of the rear tires is 22cm. The car does not have proportional steering in the sense that the front wheels can only be straight, and full lock left or right (which is 22° from the central position).
Powering the camera and transmitter
The video camera requires a regulated 12V DC 150mA power source, while the transmitter requires a 12V AC 300mA power source. The transmitter's power input circuit uses a half-wave rectifier and smoothing capacitor to transform the 12V AC signal to a rough DC signal which is then fed to a 9V DC regulator. By this method it is fine to supply the transmitter with the same 12V DC source as the video camera to operate them both correctly.
The circuit for providing the 12V DC regulated power is shown below in figure 1.

Figure 1: 12V regulated DC.
An in line fuse is connected to the video camera but not the transmitter since it contains its own surge protection. If the power consumption is as specified then the system will operate for 1.6 hours using 750mAh rechargeable cells. The 12AA NiCd cells are housed in three packs (containing four cells each) mounted on the car underneath the platform. In-line connectors are used which enable the batteries to be charged without removing them from the car. These connectors also allow the camera/transmitter to be connected to a power supply other than the battery packs. The camera/transmitter’s on/off switch is located on the battery pack at the rear of the vehicle.

Close up of the initial camera assembly
3. Research and development of the vision based navigation system
3.1 Navigational system structures used in similar projects
The design of a vision-based navigation system for the control of a vehicle depends on the required function of the vehicle. The next section of this report investigates the design and implementation of several vision based navigation systems used in current passenger and model sized autonomous vehicle projects, which are similar to the project developed here. This provides a background for the design choices made in this project.
Passenger sized vehicles
All of the prototype autonomous vehicles developed around the world are still limited in their maximum speed by their vision system computing requirements, except for the VaMoRs project, which is the world's fastest autonomous road vehicle. VaMoRs was developed by V. Graefe and K.D. Kuhnert (1992) at the Universitat der Bundeswehr Munchen. The project consists of a 5 ton van containing sensors for velocity, angular velocity and accelerations. The vision system comprises of two monochrome video cameras, a short focal length wide view and a telescopic view. The real time image processing and feature extraction is computed on a BVV 2 multi processor computer, while the control and high level computer vision is executed on an IBM IC computer (this configuration results in a visual system fast enough to allow speeds up to the maximum speed of the van itself). By recognising regions of differing grey levels in the real world 2-D video camera image, certain features such as borders between road surfaces and gutters etc can be extracted. By predicting where a particular feature is expected to be, the search module is sped up by processing a smaller region of the image. VaMoRs navigates by comparing this external real world feature image to a predicted feature image derived from an internal 4-D model of the real world. Many times a second the two feature images are compared and the differences are used to adjust the internal model of the world, so that it closely follows changes in the external world. The state of the environment is then known via the internal model in regards to object identification and position, while the state of the vehicle is known via its motion sensors and control actuator positions. By combining these known states with the physical laws of motion, control signals can then be generated.
As yet there does not exist an autonomous vehicle capable of navigation through ordinary traffic, with the speed and safety of a human driver. VaMoRs is capable of driving at its top speed of 98km/h on an open freeway, and up to 50km/h on an unpainted road, with the constraint that the vehicle must drive on empty roads.
This type of real-time 4-dimensional environment mapping would be unachievable on a modern desktop computer. The autonomous vehicle in this project will not need to recognise situations with the same generality as VaMoRs, and a simpler vision based navigation system will suffice.
Model sized vehicles
Bijan Shirinzadeh (1997) of the Robotics and Mechatronics Research Laboratory at Monash University Clayton, is currently researching simplified strategies for tracking and control of vision based model robotic vehicles. A pentium ™ computer is connected to an overhead video camera via a frame grabber card. Utilising a specially designed circuit board the computer controls two servos which are connected to the joysticks of the car’s hand held transmitter. An array representing position versus time (with regards to angular rotation, and speed) is updated each time the tracking procedure is executed, resulting in a predictive tracking strategy. As a result it is not necessary to search the entire video frame to acquire vehicle position and orientation information. Path planning is via the distance transform over tessellated space, and was chosen due to its ease of implementation.
Ning Chen (1997) of the California State University is proposing an alternative to the IEEE Region Six annual micromouse competition. The micromouse competition is restrictive in the sense that a specialised maze and robotic vehicle equipped with on board microcontrollers are required. Ning Chen’s alternative is to use an off-the-shelf radio controlled model car, CCD camera, 900MHz transmitter/receiver and VCR, and a custom designed frame grabber card. The race-track consists of a white board with black dashed lines for the track. The computer vision technique used is to reduce the 512*512 image to 256*256 resolution, then to average each 8*8 pixel sub-image block into a single block which is either black or while. Ultimately the resolution is 32*32. The software then scans all of the 32*32 blocks and groups clusters of them into one larger logical black block. Students are then required to use this information to program the race car to navigate the track as fast as possible. The system is run on a PC using the C programming language.
Both of the model sized vehicle projects above have similarities to the project in this report, but are different in significant ways. The development of the navigation system in this report will allow the car to drive on a track of any configuration, which will be surrounded by shadows and objects.
Also unlike Chen's competition the project in this report will investigate speeding up the car by choosing a pixel sub-image block size which will minimise the amount of processing, and hence speed up the visual system.
Multithreaded navigation system structure
G. Salgian and D.H. Ballard (1998) outlined visual routines for navigation based on colour and shape identification. The hardware platform they used included an expensive pipelined video processor in order to remove video processing from the main computer. Salgian and Ballard developed a large library of small visual routines which were called when a particular function was required, instead of using a single larger routine. Scheduling decisions for these routines were based on arranging them into a hierarchy of decreasing importance. They found that improved performance in vehicle behaviour was achieved by scheduling the crucial processes at a higher rate. When several crucial processes needed to be executed concurrently, acceptable vehicle performance was only obtained by reducing the vehicles' speed.
The project in this report will use slower computer equipment, and hence will implement a less general navigation system. This will result in fewer processes and simpler scheduling techniques.
Path planning techniques
Path planning involves identifying valid routes in which a mobile robot may take. R.A. Jarvis (n.d.) outlined optimal path planning methods through various types of obstacle fields, based on distance transforms. Other techniques such as A* allow similar planning.
The car in this project is constrained to follow a road which will eventually contain only a small number of obstacles. Hence path planning will not need to be as general as that provided by the above two techniques, and a simpler approach will suffice.
3.2 The navigation system structure developed in this project
The function of the navigation system designed in this project is to allow the car to complete a lap of an arbitrary track. This is achieved by identifying the conditions that: the car is in a straight section of the track; the car is in a corner; the car is running off the track; and the car has run off the track. The correct motion of the vehicle is then obtained by generating a series of moves to react to these above conditions.
The navigation system developed in this project comprises of three main sub-systems: a central decision centre; a track recognition sub-system; and a low level control sub-system.
The function of the track recognition sub-system is to identify the position of the car in relation to the track, to measure vehicle velocity, and to recognise the presence of objects between the track borders. This information is then passed to the central decision centre which identifies the above conditions and initiates the appropriate action. If the car is running off the track or has run off the track, then the central decision centre performs the vehicle recovery directly. However if the car is on a straight or curved section of the track then the control of the vehicle is delegated to the low level control sub-system.
A block diagram of these systems is shown below in figure 2.

Figure 2: the navigation system structure developed in this report
A reduction in the track identification latency will result in an increase in controllable vehicle speed. This will be achieved by modifying the sequential navigation structure in figure 2 to a multithreaded structure. Further discussion of this modification can be found in section 4.3.
3.3 Design of the track recognition
sub-systemThe design of the track recognition sub-system involves: recognising straight lines and curves in video frames; measuring vehicle velocity by examining the rate in which objects change position between successive video frames; and identifying objects between the track borders. An investigation into current methods of achieving these functions is outlined in the following section.
Successive image division
A method for locating and identifying objects in a robot’s environment is suggested by R. Duda (1970). The video image is divided into regions of constant brightness. Then from a predetermined list of expected objects a probability is assigned to each unidentified region specifying its likeness to an expected object. By applying a set of rules (such as a floor cannot be adjacent to a ceiling etc.) a score can be calculated as the sum of the difference losses between assigning regions to an object type other than their most probable choice. The scene is analysed most accurately when the lowest scoring legal selection of region types has been made.
Other scene analysis methods exist for performing successive image division, such as those researched by Brice and Fennema (1970), and R.A. Jarvis (1975) who presents a general solution to image segmentation by combining line identification with region division under a semantic structure.
All of the above methods would be as general as needed for their particular applications. For this project there would only be three types of expected objects: two road borders and possibly an obstacle. With this limited set of expected objects the generality of the above algorithms will not be required and a simpler approach will be taken.
Shape identification and clustering techniques
Hough (1962) proposed a method for detecting different shaped edges such as straight lines, circles, parabolas (and any other shape in which a parameterisation could be found) in images. R. Duda and P. Hart (1972) suggested an alternative straight line parametreisation (xcosq +ysinq =r ) which eliminated the ill-valued problem of infinite gradient lines. Duda and Hart also pointed out that with an n-parameter representation required for identifying a complicated shape, an n dimensional grid needs to be accumulated and the computation grows exponentially.
A Hough transform will be able to identify the mathematical equations of the road borders in this project, however it will be unnecessary to apply a technique as general as this. A faster alternative method will be developed in this report will only analyse a small portion of each video frame.
R.A. Jarvis and E.A. Patrick (1971) present a non-parametric clustering technique capable of partitioning image data points into different classes which are based on mathematical similarities and dissimilarities. However, since the types of data point distributions representing the track borders are known in this project, the generality of this approach is not necessary.
Determining camera velocity using optical flow
By determining the optical flow field between two images obtained from a moving camera, the camera motion can be calculated by using the lens equation developed by B. Horn (1986). A simplification to this method was presented by Sandor Fejes and Larry Davis (1998), by analysing the vector projections of the optical flow field. Unfortunately this method is too computationally expensive to be implemented in real time on the computer used in this project, and a different approach is required.
David Vernon (1999) presented a rigorous method for determining an optical flow field down to sub pixel accuracy, and as above this method is too slow to be implemented in real time.
G Quenot, J. Pakleza and T. Kowalewski (1997) presented a method in which the optical flow field is calculated by initially performing course matching between the two images in both the x and y directions using large sub-image bands. The course flow field obtained from this first step is then used to transform the first image, and subsequent matching is then performed between the second image and this newly transformed image using smaller sub-image bands. The process is iterated, and the accuracy increases with the number of iterations.
By performing only one such iteration on a small area of the video frame, a useable resolution may be achieved without requiring a large processing time. This will be investigated in this report.
Tracking of moving obstacles
Real time tracking cannot be achieved using the optical flow methods described above, since processing times are too prohibitive.
N. Paragios and R. Deriche (1998) outline a statistical approach to tracking moving obstacles. Using a 75Mhz SPARCstation20 64MB memory computer, processing times were in the order of 1 second for each 50*50 pixel image. This method of real time tracking is thus not possible using the computer in this project.
An efficient algorithm for tracking a moving object in the presence of noise for a discrete real time system, as in this project, is the tracking Kalman filter. C. Chui and G. Chen (1987) present various Kalman filters for real time applications.
3.4 Design of the low level controller sub-system
There are many reasons for choosing a particular method of control for the low level control sub-system implemented in this report. By considering various control strategies a suitable choice was made and is outlined below.
Neural network (A. Zalzala and A. Morris 1996) and fuzzy logic (S-J. Chang, C-W. Cheng and T-H S. Li 1997) (M. Ohkita, H. Miyata, M. Miura and H. Kouno 1993) controllers are used primarily for solving complex navigational tasks. The vehicles involved usually contain a large number of on board sensors for the acquisition of range and speed data. Navigation in this project will therefore not be based on neural or fuzzy control, and a simpler method will be used in this report.
A time optimal (Bang Bang) controller can be used to control a vehicle having limited power by applying full negative/positive torque for accelerating/deccelerating the load depending on the bearing error. A solution was presented by K. Ogata (1970) for a discrete time system. This scheme can only be implemented if accurate velocity measurements are available and hence the simpler style of open/closed loop proportional controllers (A. Megstel 1991, G. Rzeuski 1994) will be used while the velocity module in this project is being developed.
4. Navigation system specifications and solution methodologies
This chapter outlines in more depth how navigation is to be achieved. This involves solving the problems of sensing the location of the car relative to the track, achieving communication to the car, speeding up the visual system, measuring velocity, improving path planning, and implementing low level control of the car.
Communication to the car is achieved via an interface connecting the serial port of the computer to the hand held transmitter supplied with the car. The interface output is wired directly to the switches in the hand held unit, which are normally operated by a person moving the joysticks. By sending a particular character byte out the serial port of the computer the car executes a particular move. Communication from the car is via a transmitted UHF video signal. A VCR with a UHF antenna converts this signal into a composite video signal for input into the Silicon Graphics Indy.
The quality of video signal reception is increased by placing the antenna on a high vantage point, preferably mounted upside down from the ceiling. Further improvements would include the use of an inline signal amplifier, or the design of an aerial specifically tuned to UHF channel 30 (carrier:543.25MHz, limits:542-550MHz).
Since the viewing angle of the camera is only 75°, for practicality reasons the testing track could either be wide with shallow corners or narrow with tighter corners. The advantage of using a narrower track with tighter corners are many, the first being that this setup allows a longer and more interesting track to be laid out in the computer lab. The next advantage is that by having tighter corners, the effectiveness of different control techniques will be more prominent.
Since one of the aims of this project is to increase the speed of the car, it was decided that instead of calculating the position of the car within the track and its bearing, processing time could be reduced by calculating bearing only and interpreting a position error within the track as a bearing error. This assumption is valid (for small to moderate car velocities) since the range of position errors will be small compared to the range of bearing errors for the narrow track chosen.

Figure 3a: position error of 10 Figure 3b: bearing error of 10
Figure 3a shows a position error of 10 while figure 3b shows a bearing error also of 10, and the green dot represents the centre of the track. In both cases the car needs to be steered to the right for a small amount of time. This correction will leave the car in figure 3b parallel to the track and in the centre, however the car in figure 3a will be in the centre of the track but require a negative bearing correction before it is parallel to the track. This bearing correction will be performed in the next execution cycle.
4.3 Reducing the track identification latency
For any robotics project the overall speed of the vehicle will depend on how fast the visual system can deliver environment information to the decision centre, and by how much time this information is out of date (track identification latency). For this project it is desirable to implement a visual system that will process enough information to allow some path recognition and planning, but primarily cycle fast enough to generate control signals capable of navigating the car at modest speeds, by minimising this track identification latency time. By exploiting known information about the track environment, the initial task will be to identify the track lines as quickly as possible in the video image.
Figure 4a shows a time diagram of a simplistic vision-based navigation system. Under this system the frame rate (and hence rate of command generation) = 1/(Ta+Tt+Tc). The problem with this vision-based navigation systems is that the robot is acting on a track identification that is out of date by Ta+Tt amount of time. Ta is a fixed overhead and can only be reduced by using a faster CPU, thus reducing Tt (track recognition time) is a primary concern of this project.

Figure 4: (a) simple navigation system,
(b) multithreaded navigation system
An improvement to the simplistic vision-based system above is the multithreaded vision-based navigation system shown in figure 4b. The parent process concurrently identifies the track while the separate child process drives the car. The frame rate in this case depends on 1/(Ta+Tt) and will produce more commands per second. The track identification latency time will be at worse Ta+Tt, and will range from this time down to 0 time depending on the execution stage of the parent process when the child process finishes its driving cycle. The time required to drive the car (Tc) will vary for different bearing error inputs to the proportional controller, but in general this time will be the shortest amount of time required to move the car in a useful manner. Results using these systems may be found in section 5.4.
To turn a moving vehicle in order to compensate for a bearing error requires a shorter turning time for large velocities than for small velocities. An equation relating the amount of time to hold the steering left or right in order to reduce the bearing error to zero is:
T=K*1/Vf*error.
Where K is a constant, error is the bearing error, and 1/V f is the reciprocal of the cars forward velocity. Accurate control therefore depends on knowing the forward velocity of the vehicle.
There are hardware solutions to measure the car’s velocity, such as accelerometer chips or shaft encoding, however in the spirit of this project only visual cues will be used.
Extracting velocity information from the rate at which the car approaches a corner will only give velocity measurements at the ends of the straight sections of track. The testing track used in this report only contained three such sections, and since velocity is required all the time, an optical flow approach was taken. This approach involves identifying a fixed object in two frames and relating the shift in its position to a shift in the camera position via lens metrics. This involves feature extraction in the first frame and feature matching in the second frame, with the problem that objects disappear at the edges (and also appear when rotation is introduced). Processing times tend to be prohibitive for current solutions to this problem, and a faster method is required for this project.
The first solution investigated in this report uses one iteration of the algorithm developed by G Quenot, J. Pakleza and T. Kowalewski (1997) and was described in section 3.3. A search window is placed close to the center of expansion of the flow field so that the window size (and hence processing) will be a minimum. This window placement also allows the magnitude of the y flow field component to be approximated by a constant value across this small window, thereby reducing the distortion effect between the two frames. When the camera moves forward into the scene the objects in the second frame have moved upward compared to the objects in the first frame, and this vertical shift represents the amount of forward camera movement. The camera velocity is then calculated by using the forward camera movement and the amount of time between acquiring both frames. Feature extraction in this method is achieved by performing an edge detection on the search windows in both frames, and placing the number of edges/horizontal line into two arrays. Feature matching is then performed by finding the minimum error match between these arrays by shifting the first array upward (for a forward camera movement). This method is valid only under the assumption that the amount of identifiable horizontal and vertical edges appearing/disappearing in the search window is small compared to the total amount of edges in the window, otherwise a large distortion will occur in the window data. Since no horizontal shape information is gathered in the feature extraction process, this method allows moderate camera rotations. Section 5.2 contains the experimental results and discussions on implementing this method.
Instead of matching an entire window as above, an alternative approach involves selecting an area of pixels in the centre of the first frame's search window and matching this to a position in the second frame's search window. A shift in the x and y position of this small area of pixels then represents a forward camera shift. By centreing these search windows close to the focus of expansion the search windows are small, and distortion effects are minimal. If matching is still unsuccessful then lens distortions can be compensated for by compressing the data taken from the middle of the first frame when matching it near the edge of the screen in the second frame, which is distorted is this fashion by the wide angle lens. Camera rotation can not be compensated for using this method, and hence it can only be used when the car is travelling in a straight line. Section 5.2 contains the experimental results and discussion on implementing this method.
Since developing a vision based speedometer is no small task this research will be conducted along side the development of a controller for the car which assumes a moderate constant forward velocity. The speed of the car will be regulated by pulsing the engine, and also applying a small reverse to the engine when the error is large (this assumes that the car is travelling too fast for the corner being taken).
By sending one turn move per control cycle to the car when a corner is detected, the car only reacts to a corner once it is in it. An improvement to this method is to look further ahead down the road and use two moves (a forward followed by a turn), such that the car plans for the corner by turning just before it enters it. Unlike the previous method in which a large error is corrected once it has occurred, this method attempts to correct the smaller error just before it occurs. For this reason the car is expected to perform with increased accuracy, or alternatively for the same accuracy the average velocity will be increased. The comparative results of the above two techniques can be found in section 5.3.
The decision centre controlling the behaviour of this mobile robot includes seven states. In the initial state the computer acquires a frame and attempts to recognise the track. It then moves into one of four states reflecting the conditions of identifying no track edge borders, one track edge border, two track edge borders, and more than two track edge borders. These states are called zero track border state, one track border state, two track borders state, and >2 track borders state respectively. There are two additional states for when the car is lost called the lost state (which can be recovered from) and a give up state which is reached when the car cannot recover from the lost state.
When no road edge lines are identified the car move into the lost state, where if it is the first or second time in a row in this state then the car reverses straight back and attempts to find the track by returning to the initial state. If it is the third time in a row in the lost state then the car powers down and finishes this particular run by entering the give up state.
If it is the first time in which the one track border state is reached then the car predicts the middle of the track using the running average of the track width, and drives a small distance then returns to the initial state. However if it is the second time in a row in this one track border state then the car reverses with its wheels turned in the appropriate direction, and returns to the initial state. If the two track borders state is entered and the width of the track matches the running average of the track width, then the car drives and returns to the initial state. However if the track width is not consistent with the running average width, then the car enters the lost state. If more than two track borders are identified then the previous track position is used to find the current track position, and the width of the track is used to determine wether the car will drive and return to the initial state, or enter the lost state. A diagram showing the flow of control through the various logical states is shown below in figure 5.

Figure 5: state diagram showing the flow of control.
The control of the car will be optimized for speed primarily with enough accuracy to stay within the limits of the track.
Once the bearing error has been obtained and the high level control decides to drive the car in a forward direction, it is then a matter of sending an appropriate signal to the car.
Initially this will be achieved by using a constant gain open loop controller. On the success of this method, error feedback will then be introduced to modify the gain, changing the open loop controller to a closed loop controller.
Providing that the forward velocity measurement is accurate enough, a time-optimal (or Bang-Bang) controller will then be implemented, which is a natural solution to the control of this car with its non-proportional steering.
5. Experimental results and discussion
This chapter contains the experimental results obtained from implementing the solution strategies put forth in section 4. These results give visual system processing times, velocity measurement results, the effect of improved path planning, and the effectiveness of the low level controllers used.
5.1 Track recognition processing times
Initially a colour difference was used as the search method for extracting track lines. The pink coloured pixels in Figure 6a shows all the pixels in the range of RGB(37-75,38-82,42-93) chosen by examining 5 different pixels on a track line. When the car is rotated as in figure 6b the track pixel values change with the different auto gain controls of the camera for the slightly different ambient lighting. As a result the track is no longer identified. By readjusting the threshold range by sampling some pixels from this new frame, the search space increases to RGB(18-205,25-206,34-217) as shown in figure 6c. Although figure 6c shows a strong indication of the track lines, the search space spans over 70% of the possible range of pixel values, and hence does not discriminate track lines adequately from the surrounding shadows.

Figure 6a: correct adjustment,
Figure 6b: a change in position.
Figure6c: readjustment of threshold levels threshold levels
It was then decided to convert the video data into 8-bit grey scale intensity values, and identify sharp intensity differences by use of an edge detection algorithm. Unfortunately the above operations are time consuming to process, and the absolute minimum search area would have to be used, preferably one scan line.
Appendix 1 shows two screen grabs of the track. The small green dot shows the position of the scan line and the middle of the track being identified. Appendix 2 contains the grey scale intensity plots of these two screen grab scan lines. There are some circumstances where the surrounding shadows appear similar to the track boarders, however this is not apparent in these particular examples. Black strips of cloth were found to give the highest contrast difference compared with the particular carpet in the computer lab. The black track borders appear as dark troughs in the data, which can vary in both amplitude due to the ambient light level, and width due to the angle of the car (and lens distortion). The task was then to extract the positions of these troughs in the presence of shadows. Initially a differentiation of the data contained shadows with large enough rates of change to produce sharp spikes not to dissimilar to the track edges. This problem was overcome by averaging the raw data into pixel sub-blocks first, as shown in appendix 3. It was found that by averaging the image into pixel sub-blocks this algorithm is more robust when processing noisy data. It is then only a matter of noting a large positive spike followed by a large negative spike (within suitable bounds of width and height) to identify the track edges from the carpet and surrounding shadows.
For illustration purposes figure 7 shows a screen capture of applying this edge detection to every scan line in the image.

Figure 7: edge detection for 6*1 pixel sub-blocks, with 4 wide differentiation.
The next task was to choose the best size sub-pixel division with regards to resolution, noise effects and total processing time. Table 2 below shows timing information for 4*4 down to 1*1 pixel division, arranged in decreasing processing time. All measured times had an associated error of 4%, and are the averaged times of a 30 second interval operating at 6 frames per second.
Time to acquire a video frame: 108950 microseconds.
Time to write a frame to the screen: 31769 microseconds.
Table 2:
visual system times are in microseconds|
Block size (width*height) |
Horizontal resolution |
Grey scale time |
Averaging time |
Edge detect time |
Identify the track time |
Total time in microseconds |
|
4*4 |
192 |
9134 |
1346 |
126 |
55 |
10661 |
|
3*3 |
256 |
6904 |
976 |
158 |
56 |
8904 |
|
4*3 |
192 |
6866 |
868 |
124 |
52 |
7910 |
|
2*2 |
384 |
4585 |
1127 |
198 |
59 |
5969 |
|
3*2 |
256 |
4598 |
831 |
161 |
55 |
5645 |
|
4*2 |
192 |
4582 |
607 |
97 |
60 |
5346 |
|
2*1 |
384 |
2304 |
937 |
180 |
53 |
3474 |
|
3*1 |
256 |
2296 |
667 |
119 |
52 |
3134 |
|
4*1 |
192 |
2321 |
525 |
96 |
56 |
2998 |
|
5*1 |
153 |
2314 |
454 |
81 |
54 |
2903 |
|
6*1 |
128 |
2325 |
397 |
71 |
53 |
2843 |
|
7*1 |
109 |
2310 |
363 |
64 |
56 |
2793 |
|
1*1 |
768 |
2346 |
0 |
360 |
73 |
2779 |
|
8*1 |
96 |
2310 |
324 |
55 |
53 |
2742 |
|
9*1 |
85 |
2307 |
310 |
50 |
50 |
2717 |
|
10*1 |
76 |
2312 |
292 |
49 |
52 |
2705 |
|
15*1 |
51 |
2300 |
235 |
50 |
55 |
2640 |
The processing time increases proportionally with the pixel block height, and thus a size of n*1 was a good starting point. Of the sizes considered (15*1 to 1*1) the resolution of bearing error increases as the inverse of block width, and so a block width of 1 gives 15 times the resolution of a width of 15. However the effect of noise is more prominent for smaller block widths. It is thus necessary to choose the smallest width (to maximize the resolution) which is still resilient to noise.
Figure 8a is two screen captures overlaid showing how a 15*1 block size has a poor resolution for a bearing error of 1. Figure 8b shows how this same resolution allows identification of the track in noise (the green dot is still central to the track borders). Figures 8c and d show how a 6*1 block size has an increased resolution and can still identify the track in noise.

Figure 8a: 15*1 block size, bearing error change of 1
Figure 8b: 15*1 block size, correct track
identification in noise

Figure 8c: 6*1 block size, bearing error change of 1
Figure 8d: 6*1 block size, correct track
identification in noise

Figure 8e: 4*1 block size, bearing error change of 1
Figure 8f: 4*1 block size, bearing error change of 1
Figure 8e and f show how a 4*1 block size has a high resolution, but loses track identification in noise (the blue dot to the left of the track represents a track midpoint when more than one line is identified. In this case noise components are interpreted as lines). A block size of 5*1 also suffered this noise problem, and thus a block size of 6*1 was used in this project. By using only 1 scan line the percentage of pixels examined in order to find the track is 0.17%, since the initial resolution is 768*576.
Using this sub block pixel size, and the above mentioned height and width constraints to filter off shadows, it was found that the track could be identified up to 90cm from the front of the car. This range can be extended by choosing a smaller block width (increasing the resolution) at the expense of reducing noise tolerance.
The differentiation was performed by using a 4 wide sliding window and calculating an intensity derivative of blocks 1+2-(3+4). The 4 wide sliding window was the most successful width to use for the particular black borders used in this project.
Using the notation in figure 4 the expected track identification latency time and frame rate for the simple vision based navigation system are calculated below.
Ta=109ms, Tt=2.8ms, Tc is from 280ms to 720ms depending on the controller used.
Track identification latency time=Ta+Tt=111.8ms.
Frame rate=1/(Ta+Tt+Tc)=1.2 to 2.55 frames/second
It should be noted that the latency time will be slightly longer, and the frame rate slightly lower due to the additional overheads of setting miscellaneous variables, and operating system management.
For the multithreaded system:
Ta=109ms, Tt=2.8ms, Tc is from 280ms to 720ms depending on the controller.
Track identification latency time=Ta+Tt=0 to 111.8ms
Frame rate of parent=1/(Ta+Tc)=8.95 frames/second.
Frame rate of child=1/(Tc)=3.57 to 1.39 frames/second.
The actual frame rate will be lower due to the above reasons, and also for the fact that a single CPU is executing both processes, which are competing for time slices. In table 3 below the parent process which identifies the track is PID 804 and uses 18.06% of the CPU time, while PID 805 is the child process which drives the car and uses 50.16% of the CPU.
Table 3:
process statistics when the multithreaded navigation system is runningIRIX indy17 6.2 IP22 load averages: 1.98 1.29 0.79 16:11:04
63 processes: 60 sleeping, 2 ready, 1 running
CPU: 16.7% idle, 9.4% usr, 70.9% ker, 0.0% wait, 0.0% xbrk, 3.0% intr
Memory: 96M max, 88M avail, 62M free, 100M swap, 100M free swap
PID PGRP USERNAME PRI SIZE RES STATE TIME %WCPU %CPU COMMAND
805 804 timb 100 4628K 1092K ready 1:15 46.44 50.16 car
804 804 timb 73 4628K 1092K ready 0:30 15.02 18.06 car
517 517 root 64 8260K 3880K sleep 1:45 6.55 8.62 Xsgi
809 809 timb 61 1960K 840K run/0 0:00 1.25 1.53 top
141 141 root 60 1364K 508K sleep 0:00 0.03 0.16 routed
594 562 timb 60 3412K 1152K sleep 0:02 0.18 0.12 xclock
637 562 timb 60 4024K 1556K sleep 0:01 0.07 0.08 xterm
3 0 root +39 0K 0K sleep 0:00 0.04 0.03 bdflush
172 0 root 61 0K 0K sleep 0:00 0.00 0.01 nfsd
173 0 root 61 0K 0K sleep 0:00 0.00 0.00 nfsd
By changing the process's priorities a higher frame rate can be achieved at the cost of reducing the accuracy in measuring hold times for commands sent to the car. The CPU percentages shown in table 3 were used in this project, and the resulting frame rates are shown in section 5.4.
5.2 Velocity measurement results
Method 1
For realistic car velocities of around 0.4 m/s, a shift in camera position of 40cm would occur between two frames operating at 2 frames per second. When the car was pointed at existing furniture in the room, for a forward shift of 40 cm it was found that 80 scan lines was deep enough to ensure the basis assumption. Restated: the amount of identifiable horizontal and vertical edges appearing/disappearing in the search window is small compared to the total amount of data in the window. It was necessary to use 4*4 pixel sub blocks in order to edge detect objects in the room, since at higher resolutions a large proportion of edges were not identified correctly in both frames. The percentage of pixels examined is then 13.9 %, and the processing time is 20*10.7ms = 214ms for the edge detect (the number of edges and the mean position of them is accumulated during the edge detection). This results in a frame rate of 1.65 to 0.96 and a track identification latency of 325ms in the simple navigation system. The multithreaded system would have to be used in which the frame rate is 3.08 and the track identification latency is 0 to 325ms.
Figure 9 shows a plot of the number of edges per scan line for two frames with a forward camera shift of 40cm and a 15° rotation. Line number 0 to 20 are the 20 4*4 averaged scan lines, with line number 0 being the highest position on the screen. By shifting the data of first frame(red) vertically upward by using the two highest peaks in each plot for matching to the second frame(black), the y component of the flow field is calculated as 2.

Figure 9: a vertical shift represents the y component of the flow field
Unfortunately this method does not provide enough resolution in measuring the flow field. A larger window is therefore required, since reducing the pixel sub block size has the associated problem of misidentifying the same edge in both frames. For this reason it is not plausible to use this method since the larger search window required would be too computationally expensive.
Method 2
For the alternative method in which a small area of pixels in the first frame's search window is matched in the second frame's search window, a shift in both the x and y directions represents a forward camera shift. The height of the search windows are 80 scan lines comprising of 20 4*4 averaged scan lines and hence processing times are very similar those calculated in the first method. A pixel block size of 4*4 was chosen to improve noise effects and to reduce the effect of pixel fluctuations due to the fluorescent lighting in the room (which did not affect the edge detection in the first method). The area of pixels from the first frame that matching was performed with, was a sub string of the central 4*4 scan line (horizontal width is 192). Measurements were taken for a forward camera shift of 40cm repeated 30 times, with no rotation.
Table 4 shows the results of using various sized sub strings.
Table 4: matching effectiveness of different sub string lengths|
Sub string length |
Number of correct matches |
Number of incorrect matches |
Processing time in milliseconds |
|
24 |
13 |
17 |
227 |
|
40 |
17 |
13 |
236 |
|
100 |
29 |
1 |
271 |
When the search windows are placed at the focus of expansion of the flow field, the measured pixel shift was only 1 unit for a camera shift of 40cm, which is not fine enough resolution for this project. By moving the search window off the centre of expansion as in figure 10 a and b (the gray scale bands near the top of the image), the measured shift was increased to 5 units for a forward camera shift of 40cm.

Figure 10a: frame 1
Figure 10b: frame 2, forward shift of 40cm
As stated in section 4.4 camera rotations would cause mismatching as can be seen in table 5 when a forward camera shift of 40cm and a 15° rotation was tested.
Table 5: incorrect matching due to rotation:|
Sub string length |
Number of correct matches |
Number of incorrect matches |
Processing time in milliseconds |
|
100 |
5 |
25 |
271 |
This method could be used as a measure of forward velocity when the car is travelling in a straight line but not during corners. Since the velocity of the car is required all the time, it is not plausible to use this method for navigation.
5.3 The effect of improved path planning

Figure 11: The testing track
The testing track used for all of the navigation methods researched in this project is shown in figure 11. The total length of the track is 9.1 metres long, with the numbered lengths being: 1 = 130cm, 2 = 60cm, 3 = 12cm, and 4 = 23cm.
Initially a single move was use to compensate for a bearing error when a corner was detected. In this method track identification is performed at a distance of 30cm ahead of the car's bumper. The controller used (and resulting error plot) is shown in appendix 4 and is referred to here as Controller 1.
By implementing the improvements suggested in section 4.5, the track identification is done at a distance of 39cm ahead of the car and uses two moves to perform the error compensation (a forward move followed by a turn move). The controller and error plot are found in appendix 5, and referred to here as controller 2.
This distance of 39cm was found to be the furthest distance that could be used, since the lack of velocity information placed an upper limit to the size move allowed before the accuracy of the move was adversely affected.
.Table 6 shows the results of implementing both control methods.
Table 6:
increased path planning results|
Controller |
Total time (sec) |
Frame rate (frames/sec) |
Number of times the car saw 1 line |
Number of times the car reversed with its wheels turned |
Number of times the car ran off the track and reversed |
|
1 (appendix 4) |
44.3 |
2.187 |
15 |
11 |
2 |
|
2 (appendix 5) |
35.1 |
2.194 |
8 |
6 |
2 |
Using controller 2 the car identified both track edge borders more often and required less recovery to stay on the track compared with using controller 1. The total time required for one lap was reduced by 26% when controller 2 was used in place of controller 1. Also, the accuracy with which the car stayed on the track was improved by using controller 2, since the error plot varied between 28 and -46 compared with 37 to -53 for controller 1.
For these reasons the control strategy in controller 2 was used for the remainder of this project.
5.4 Effectiveness of the low level controllers
Table 7 shows the results obtained using various controllers for the testing track in section 5.3. The open loop controllers used were: controller 2: 12 position (appendix 5); controller 3: 200 position (appendix 6), and controller 4: 200 position multithreaded execution (appendix 7). The closed loop controllers used were: controller 5: 12 position (appendix 8); controller 6: 200 position (appendix 9), and controller 7: 200 position multithreaded execution (appendix 10).
The error plots are also included in the respective appendices.
Table 7:
controller effectiveness|
Controller |
Total time (sec) |
Frame rate (frames/sec) |
Average length of generated commands in seconds |
Number of times the car saw 1 line |
Number of times the car reversed with its wheels turned |
Number of times the car ran off the track and reversed |
|
2 (appendix 5) |
35.1 |
2.194 |
0.344 |
8 |
6 |
2 |
|
3 (appendix 6) |
29.5 |
1.794 |
0.446 |
2 |
1 |
3 |
|
4 (appendix 7) |
22.8 |
PARENT: 6.158 CHILD: 1.626 |
0.615 |
20 |
5 |
2 |
|
5 (appendix 8) |
34.4 |
1.686 |
0.481 |
7 |
5 |
0 |
|
6 (appendix 9) |
26.0 |
1.731 |
0.466 |
5 |
3 |
2 |
|
7 (appendix 10) |
28.1 |
PARENT: 6.192 CHILD: 1.281 |
0.781 |
14 |
3 |
1 |
Open loop performance
Controller 3 is superior to controller 2, since the finer resolution of track division improved the accuracy of the car. This allowed the gain of controller 3 to be larger than that of controller 2, so that a smaller number of moves covering more distance could be used. This resulted in a 19% speed up in total time, with the amount of recovery still being less than that required by the slower operation obtained using controller 2. A comparison of the error plots shows that even with the increased gain of controller 3, the error plot is still marginally improved. The error plots also indicate an improvement to the smoothness of the car.
A reduction in the track identification latency time was implemented by using controller 3 in the multithreaded mode. Commands generated to the car are more up to date in this mode since the time delay associated in identifying the track has been reduced. Unfortunately it is difficult to measure the particular reduction in track identification latency each time, and this uncertainty causes a small reduction in the accuracy of the car. However the reduction in track identification latency did allowed the use of larger moves by increasing the gain further, as is seen in controller 4, and resulted in a speed up of 29%. The error plot for controller 4 contains two large positive errors with the flat spots representing recovery times. These large errors are caused by the multithreaded mode in which the operating system is switching between the child process driving the car, and the parent process acquiring frames. If this switch occurs near the time when the child process should send a stop control signal to the car, then the car will continue acting on this command while the child is sleeping, resulting in up to a 30% error in measuring braking times (braking is achieved by reversing the motor). This introduces a new error in measuring an amount of time accurately. The expected increase in accuracy achieved by reducing the track identification latency time is thus offset by the introduction of this new error. For this reason controller 4 is far less accurate than controller 3.
A solution to measuring time accurately in the multithreaded mode would be to use a computer containing two CPUs. Alternatively the process driving the car could be replaced by a circuit connected to the serial port which would send commands to the car, and measure time by counting the ticks of its own clock
Closed loop performance
| © T . B r u t o n 1 9 9 9 |
w w w . c s s e . m o n a s h . e d u . a u |
The method of error feedback used in controllers 5, 6 and 7, used the current error and the previous error for deciding on gain adjustments. If there was a sign change in the error then the system was considered to over shoot and the gain was decreased. If there was no sign change in the error then the system was considered to under shoot and the gain was increased. The change in gain used was equal to the current error * 1% of the gain. The feed back controllers shown in the appendices are the result of using this above error feedback method on the testing track used in this report.
Controller 5 is the closed loop version of controller 2. By introducing feedback a 2% speed up of the car was obtained. This speed up is due to the car spending less time in recovery mode, due to smoother operation (as evident in the error plots). The frame rate is slower in controller 5 than in controller 2 since the error feedback had a net effect of increasing the turning gain, which causing longer moves to be sent to the car.
Introducing feedback to controller 3 results in the closed loop version named controller 6. A 13% speed up was obtained via the increased gain in controller 6. Even with this large increase in speed, the error plots and number of recovery actions taken are similar for both controllers.
Introducing feedback to controller 4 increased the gain of the system as shown in the plot of controller 7. A 23% reduction in speed was measured, and is explained by the increased amount of braking. The obvious solution of using less braking caused the car to spend most of its time in recovery mode, since speed regulation is difficult to achieve when large gains are present and no measure of velocity is available.
Overall Performance
The fastest time of 22.8 seconds was measured using controller 4 (the open loop 200 position multithreaded controller) due mainly to the reduced track identification latency times which allowed larger gains and resulted in the second largest sized commands. This controller was the most inaccurate, since not knowing the speed of the car caused it to run wide through a corner when it was travelling slowly, and when it was moving fast it would turn too tight. As a trend the larger the generated moves, the worse the accuracy got. This is confirmed by comparing controller 3 with controller 6, since these controllers are similar except that controller 6 generates larger moves and experiences reduced accuracy.
The most accurate controller was controller 3 (the open loop 200 position controller), since it did not have the error in time measurement associated with the multithreaded controllers. It also had small controller gains (resulting in the second smallest sized commands) which reduced the effects of track identification latency in this simple navigation system.
In considering both speed and accuracy the best performance was from controller 6 (the closed loop 200 position controller) since it was the second most accurate, and also the second fastest. The accuracy is due to this controller not having the errors associated with the multithreaded systems, and the fact that command generation size was the third smallest and hence the effects of track identification latency were small. The gain was high enough however to generate commands which were large enough to allow a moderate car speed, and a good balance between speed and accuracy was obtained.
6. Future research recommendations
This section highlights ideas which further research on this project could implement in order to see an increase in the speed and accuracy of the car.
Velocity
The second method researched in section 5.2 could be modified by centering the search window directly on the focus of expansion. This would reduce the y component of the flow field to zero and hence reduce the search window to one scan line in height. Processing the left half of the screen separately to the right half, would then allow determination of forward camera shifts and rotations.
Path planning
Instead of just measuring the middle of the track at one position only, if two different positions are measured it is then possible to calculate the heading of the track, so that both the car’s position and bearing errors are known. Then by using the velocity of the car to calculate its turning curvature, a correction for both the position and bearing error can be achieved by transforming the car’s position and bearing to equal that of the track. This would be done by using a straight line (forward move by the car) and two circular arcs (2 turn moves by the car), and would result in placing the car in the centre of the track and parallel to it.
Obstacle avoidance
Since any moving obstacles will be located near the track, by using multiple scan lines spaced closer than the expected size of the such obstacles, it would be possible to identify them at the same time as the track. Using 10 scan lines will only increase the track recognition time from 109ms (using one scan line) to 137ms and would not adversely reduce the speed or accuracy of the car.
The robotic car designed and constructed in this report implemented real time control by integrating the sub-systems of: powering the video equipment; communication to and from the car; real time track recognition; path planning; real time control; and vehicle recovery. The initial speed obtained from navigating the testing track in section 5.3 was 0.20m/s. The fastest speed obtained was 0.40m/s using a multithreaded approach with improved path planning.
The recognition of the track was ultimately performed using an edge detection algorithm processing only 0.17% of each video frame, while still successfully identifying the track in the presence of noise.
The total time required to identify the track was 109ms(acquire frame) + 2.8ms(track recognition) resulting in a frame rate of 8.9 frames/second. It is quite obvious that further improvements to track recognition will not speed up the car, since the bottleneck is in the time required for the computer to digitize a video frame.
Overall planning was improved by looking further down the track and using two moves to compensate for bearing errors, instead of only using one move. This reduced the time taken to navigate the testing track from 44.3 to 35.1 seconds, with an improvement in accuracy.
It was found that to increase vehicle speed larger moves generated at a slower rate are required. There was however a maximum size of generated moves before loss of control of the car occurred. The first factor limiting move size was that, since vehicle speed was not known, if the car was moving slowly it would run wide through a corner, and, if the car was moving fast it would turn too tight. The larger the generated moves, the worse the accuracy became. The second limiting factor was the time delay in acquiring a frame and identifying the track. This resulted in the sequence of moves sent to the car being out of date by this amount of time (referred to in this report at track identification latency). The effect of this latency is more pronounced for larger moves. Track identification latency was reduced by continually identifying the track and driving the car simultaneously, as in the multithreaded mode, in which the operating system is switching between the child process driving the car, and the parent process acquiring frames. When the child process finishes driving the car the amount of time it must wait for the track identification to be completed is thus 0ms up to a time which is no worse than the track identification latency in the simple navigation system. Unfortunately it is difficult to measure the particular track identification latency each time, and this uncertainty causes a reduction in the accuracy of the car. Also when an operating system switch occurred near the time when the child process should have sent a stop control signal to the car, the car continued acting on this command while the child was sleeping, resulting in up to a 30% error in measuring braking times. Using this multithreaded method in open loop mode achieved the fastest measured time of 22.8 seconds (generating the second largest sized commands), but displayed the worst accuracy. By using two CPUs or, alternatively, a hardware controller to send and measure the generated commands, this accuracy would be increased without a reduction in speed. Alternatively control of the car at its top speed could be achieved using the simple navigation system on a faster computer, so that the track identification latency is no longer a limiting factor.
The most accurate controller was controller 3 (the open loop 200 position controller), since it did not have the error in time measurement associated with the multithreaded controllers, and also had small controller gains (resulting in the second smallest sized commands) which reduced the effects of track identification latency.
In considering both speed and accuracy the best performance was from the 200 position closed loop controller, giving the second fastest time of 26.0 seconds. As above this method did not have the multithreaded controller problems, and since the command size was the third smallest the effects of track identification latency were small.
The best navigation system that could be used for this project would be the 200 position closed loop multithreaded controller with the modifications of: the hardware board for timing and sending commands to the car to avoid timing errors; and the inclusion of velocity measurement for gain adjustment.
The first method of velocity measurement of the car was not plausible due to the prohibitive amount of processing time required. The second method investigated in this report was not as computationally expensive, however this method could not be used due to its inability to handle camera rotation.
Unfortunately obstacle avoidance was not implemented, but future research suggestions were made.
Allthings Sales and Service, P.O Box 25 Westminster, Western Australia 6061. Telephone: (08) 9349 9413. "Allthings Sales and Service." MUD History. 1999. http://www.allthings.com.au/ (3 May. 1999).
C.R. Brice and C.L.Fennema, 1970 "Scene analysis using regions," Artificial Intelligence vol.1, pp.205-226.
S-J. Chang, C-W. Cheng, T-H S. Li 1997 "Design and implementation of fuzzy garage-parking control for a PC-based model car," IEEE IECON, pp 1299-1304.
N. Chen, 1997 "A vision-guided autonomous vehicle: an alternative micromouse competition," IEEE Transactions on Education vol.40, no.4, Nov. pp253-258.
C.K. Chui, G. Chen, 1987 Kalman filtering with real time applications, New York: Springer-Verlag.
R.O. Duda, 1970 "Current techniques for scene analysis," Artificial Intelligence Group, Technical note 46 (SRI Project 8259).
R.O. Duda and P.E. Hart, 1972 "Use of the Hough transformation to detect lines and curves in pictures," Communications of the ACM vol.5, no.1, Jan. pp11-15.
S. Fejes and L. Davis, 1998 "What can projections of flow fields tell us about the visual motion," 6th International Conference on Computer Vision Jan.4-7, pp 978-986.
V. Graefe, K-D. Kuhnert, 1992 Vision-based autonomous road vehicles in: Vision-based vehicle guidance, I. Masaki (ed.), New York: Springer-Verlag, pp 1-29.
B.K.P. Horn, 1986 "Robot vision," MIT Press.
P.V.C. Hough, 1962 "Method and means for recognising complex patterns, " U.S. Patent 3069654, Dec.18.
T. Ito and S. Kawakatsu, 1992 An extracting method of the optical flow for an anticollision system in: Vision-based vehicle guidance, I. Masaki (ed.), New York: Springer-Verlag, pp 255-267.
R.A. Jarvis, 1973 Clustering using a similarity measure based on shared near neighbors in: Computer vision and robotics lecture notes, Melbourne: Monash University.
R.A. Jarvis, n.d. Distance transform based path planning for robot navigation in: Computer vision and robotics lecture notes, Melbourne: Monash University.
R.A. Jarvis, 1975 "Image segmentation by interactively combining line, region, and semantic structure"Proceedings on computer graphics, pattern recognition, and data structure, pp 279-288.
Jaycar Electronics, 887-889 Springvale Road Mulgrave, Victoria 3170. Telephone: (03) 9547 1046. "Jaycar Electronics in Cyberspace." MUD History. 1999. http://www.jaycar.com.au/NEWSITE/home.htm (3 May. 1999).
A. Megstel, 1991 Autonomous mobile robots. Singapore: World Scientific Publishing Co. Pte. Ltd.
K. Ogata, 1970 Optimal and adaptive control systems in: Modern control engineering, New York: Prentice Hall, pp 750-811.
M. Ohkita, H. Miyata, M. Miura, H. Kouno, 1993 "Travelling experiment of an autonomous mobile robot for flush parking," IEEE International conference on Fuzzy systems vol.1, pp327-332.
N.K. Paragios, R. Deriche, 1998 "A PDE-based level-set approach for detection and tracking of moving objects," 6th International Conference on Computer Vision Jan.4-7, pp 1139-1145.
G.M. Quenot, J. Pakleza, T.A. Kowalewski, 1997 Particle image velocimetry with optical flow in: Experiments in fluids, New York: Springer-Verlag, pp 177-189.
G. Rzevski, 1994 Control in: Mechatronics perception cognition and execution, United Kingdom: The Open University, pp 219-262.
G. Saalgian and D.H. Ballard, 1998 "Visual routines for autonomous driving," 6th International Conference on Computer Vision Jan.4-7, pp 876-882.
B. Shirinzadeh, 1997 "Vision-based tracking and control of remotely controlled vehicles," Industrial robot vol.24, no.4 pp 297-301.
Total Turnkey Solutions, P.O Box 322 Coburg, Victoria 3058. Telephone: (03) 9350 7377. "Total Turnkey Solutions." MUD History. 1999. http://www.turnkey-solutions.com.au/ (3 May. 1999).
D. Vernon, 1998 "Computation of instantaneous optical flow using the phase of Fourier components," Image and Vision Computing, pp 189-199.
A. Zalzala, A. Morris 1996 Neural networks for mobile robot piloting control in: Neural networls for robotic control, United Kingdom, Hertfordshire: Ellis Horwood Limited, pp 137-161.
A. Zalzala, A. Morris 1996 A neural network controller for the navigation and obstacle avoidance of a mobile robot in: Neural networls for robotic control, United Kingdom, Hertfordshire: Ellis Horwood Limited, pp 162-191.
These screen captures show the calculated mid point (green dot), which is located 39cm in front of the car. The intensity, averaging, and edge detection plots in appendices 2 and 3 are for the scan line positioned here. The same top/bottom ordering applies.


8 bit gray scale intensity plots for the scan lines in the screen
captures in appendix 1. The top and bottom plots correspond with the top and
bottom screen captures in appendix 1.
Appendix 3 – Scan line averaging and edge detect plots
The top and bottom plots correspond with the same ordering in appendix 1 and 2. The black line is a 6*1 averaged plot of the intensity scan line in appendix 2. The red line is a plot of the edge detection being performed on it.

Appendix 4 – Open loop 12 position simple path planning controller and error plot

The top plot is the controller
used looking 30cm ahead with no path planning. The bearing error is on the x
axis, the output to the car is on the y axis. These plots are for one half of
the track, the other half is the mirror image. The black line represents the
forward move time, the blue is the turn time, and the red is the brake time
(reverse motor). The error plot which was obtained from navigating the
testing track is the bottom plot.

Appendix 5 – Open loop 12 position controller and error plot
The improved path planning controller plot includes an additional forward move for errors around 10 compared to appendix 4, otherwise they are identical. The car is looking ahead by 39cm. The error plot for navigation of the testing track is the bottom image.

Appendix 6 – Open loop 200 position controller and error plot


The resolution of this controller is 200, 100 for each side of the
track. The colouring code is: BLACK for forward, BLUE for turn, RED for
brake. The error plot was obtained from navigation of the testing track.
Appendix 7 – Open loop 200 position controller, multithreaded execution and error plot


This controller was used in the multithreaded mode and has a
larger gain than the similar controller in appendix 6. The error plot
obtained from navigating the testing track is the bottom image.
Appendix 8 – Closed loop 12 position controller and error plot


This controller is the result of introducing feed back to the
controller in appendix 5, by navigating the testing track. The error plot is
also shown.
Appendix 9 – Closed loop 200 position controller and error plot


This controller is the result of applying feed back to the
controller in appendix 6 by navigating the testing track. The error plot is
also shown.
Appendix 10 – Closed loop 200 position controller, multithreaded execution and error plot


This controller is the result of applying feedback to the controller in appendix 7.
The error plot obtained from navigating the track is also shown.
HTML: 3/2000 L. Allison - who tried to curb the worst excesses of MS-Word, which may have caused some loss of information.