Saturday, October 08, 2005
Wednesday, October 05, 2005
National Qualification Event
There are interesting videos from the NQE ahead of Thursday's selection. They can be found on the CarTV site. In this video, Stanford provides insight into the vision system of its entry. Stanford also has its own collection of videos.
Friday, September 30, 2005
DARPA Grand Challenge NQE underway.
The DARPA Grand Challenge National Qualification Event has been taking place for the past several days; more information can be found in this brochure. A good blog on the event can be found here. Meanwhile, since we have been eliminated from this year's event, we are playing with Google Maps technology and evaluating how it can be applied to the display of our engineering data. John Wiseman did a great job of displaying last year's DARPA race through Google Maps. We just used it to display places on the Paris map for the "Nuit Blanche" event. We feel it has good potential when applied to the data we are making freely available from our test runs.
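To give an idea of what that involves, here is a small sketch (not part of our production code) that turns the $GPRMC sentences from one of our GPS logs into decimal latitude/longitude pairs ready to be handed to a map overlay. The log format assumed here is the timestamped raw NMEA output written by the drive-by-wire program shown further down this page; the file name is hypothetical.

# Minimal sketch: convert logged $GPRMC sentences to decimal lat/lon pairs.
# 'gps_output.txt' is a hypothetical name standing in for one of our GPS logs.

def nmea_to_decimal(value, hemisphere):
    # NMEA latitude/longitude come as (d)ddmm.mmmm; convert to decimal degrees.
    degrees = int(value / 100)
    minutes = value - degrees * 100
    decimal = degrees + minutes / 60.0
    if hemisphere in ('S', 'W'):
        decimal = -decimal
    return decimal

def read_track(filename):
    points = []
    for line in open(filename):
        if '$GPRMC' not in line:
            continue
        # drop the timestamp our logger prepends and keep the raw sentence
        fields = line[line.index('$GPRMC'):].strip().split(',')
        # $GPRMC fields: name, utc time, status, lat, N/S, lon, E/W, ...
        if len(fields) < 7 or fields[2] != 'A':   # 'A' means a valid fix
            continue
        lat = nmea_to_decimal(float(fields[3]), fields[4])
        lon = nmea_to_decimal(float(fields[5]), fields[6])
        points.append((lat, lon))
    return points

if __name__ == '__main__':
    for lat, lon in read_track('gps_output.txt'):
        print '%f, %f' % (lat, lon)

Each (lat, lon) pair could then be dropped onto a map overlay, much like what John Wiseman did for last year's race.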
Wednesday, June 08, 2005
Result of DARPA selection to NQE
Monday, May 30, 2005
How do you drive-by-wire a car using Python?
Our vehicle used this program to be driven around from a laptop installed in the car. Someone sitting in the vehicle would punch keys on the keyboard to accelerate, brake, and turn right or left. The commands would go through the parallel port of the laptop to a controller that would in turn command several electrical stepper motors doing the manual work. We left the GPS thread in the code so that one can see how we use threads; in the real program we also use other threads to monitor the cameras and the inertial information coming from the IMU, but we left them out of this code for the sake of clarity. The program runs under Windows and is fast enough for the purpose at hand. The drive-by-wire code lets us drive the vehicle around and gather data on maneuverability and collision avoidance. The sentences spoken by the program through the TTS module may look like a gimmick, but they are in fact very helpful when a single person has to drive and collect data at the same time.
The program starts two threads: one watches the keyboard for instructions, the other records data from the GPS sensor.
# This code is released under the General Public License (GPL).
#
# DISCLAIMER: if you do not know what you are doing, do not use this
# program on a motorized vehicle or any machinery for that matter.
# Even if you know what you are doing, we absolutely cannot be held
# responsible for your use of this program under ANY circumstances.
# The program is released for programming education purposes only,
# i.e. reading the code will allow students to understand the flow
# of the program. We do not advocate nor condone the use of this
# program in any robotic system. If you are thinking about including
# this software in machinery of any sort, please don't do it.
#
# This program was written by Pedro Davalos, Shravan Shashikant,
# Tejas Shah, Ramon Rivera and Igor Carron.
#
# Light version of the drive-by-wire program
# for Pegasus Bridge 1 - DARPA GC 2005
#
import sys, traceback, os.path, string
import thread, time, serial, parallel
import re
import struct
import msvcrt
import pyTTS
rootSaveDir = "C:\\PegasusBridgeOne\\"
gpsSaveDir = os.path.join(rootSaveDir,"gps")
ctlSaveDir = os.path.join(rootSaveDir,"control")
stopProcessingFlag = 0
printMessagesFlag = 1
p = None
tts = pyTTS.pyTTS()
gps_data = list()
imu_data = list()
def runnow():
    print "Drive-by-wire program running now"
    # launch controller thread
    t = thread.start_new_thread(control_sensors, ())
    # launch gps sensor thread
    t1 = thread.start_new_thread(read_gps_data, ())
def control_sensors():
    global stopProcessingFlag
    global ctlSaveDir
    global p
    printMessage("Control thread has started")
    p = parallel.Parallel()
    p.ctrlReg = 9  # x & y axis
    p.setDataStrobe(0)
    sleep = .01
    while (stopProcessingFlag != 1):
        key = msvcrt.getch()
        length = len(key)
        if length != 0:
            if key == " ":  # check for quit event
                stopProcessingFlag = 1
            else:
                # arrow keys arrive as a two-character sequence: a prefix
                # byte (0x00 or 0xe0) followed by the scan code
                if key == '\x00' or key == '\xe0':
                    key = msvcrt.getch()
                print ord(key)
                strsteps = 5
                accelsteps = 1
                if key == 'H':   # up arrow: accelerate
                    move_fwd(accelsteps, sleep)
                if key == 'M':   # right arrow: steer right
                    move_rte(strsteps, sleep)
                if key == 'K':   # left arrow: steer left
                    move_lft(strsteps, sleep)
                if key == 'P':   # down arrow: brake
                    move_bak(accelsteps, sleep)
    print "shutting down"
def read_gps_data():
    global gpsSaveDir
    global stopProcessingFlag
    global gps_data
    printMessage("GPS thread starting.")
    # Open serial port 5 (COM6) at 38400 baud
    ser = None
    try:
        gps_ofile = open(os.path.join(gpsSaveDir,
                         time.strftime("%Y%m%d--%H-%M-%S-") +
                         "-gps_output.txt"), "w")
        ser = serial.Serial(5, 38400, timeout=2)
        while (stopProcessingFlag != 1):
            resultx = ser.readline()
            # if re.search('GPRMC|GPGGA|GPZDA', resultx):
            splt = re.compile(',')
            ary = splt.split(resultx)
            gps_ofile.write(get_time() + ', ' + resultx)
            gps_data.append([get_time(), resultx])
            # keep only the last 100 sentences in memory
            if (len(gps_data) > 100):
                gps_data.pop(0)
        ser.close()
        gps_ofile.close()
    except:
        write_errorlog("GPS Read method")
        if ser != None:
            ser.close()
def write_errorlog(optional_msg=None, whichlog='master'):
    if optional_msg != None:
        writelog(whichlog, "\nERROR : " + optional_msg)
    type_of_error = str(sys.exc_info()[0])
    value_of_error = str(sys.exc_info()[1])
    tb = traceback.extract_tb(sys.exc_info()[2])
    tb_str = ''
    # error printing code goes here
def get_time(format=0):
    stamp = '%1.6f' % (time.clock())
    if format == 0:
        return time.strftime('%m%d%Y %H:%M:%S-',
                             time.localtime(time.time())) + str(stamp)
    elif format == 1:
        return time.strftime('%m%d%Y_%H%M',
                             time.localtime(time.time()))
def if_none_empty_char(arg):
    if arg == None:
        return ''
    else:
        return str(arg)
def writelog(whichlog, message):
    global rootSaveDir
    logfile = open(os.path.join(rootSaveDir, whichlog + ".log"), 'a')
    logfile.write(message)
    logfile.close()
    printMessage(message)
def printMessage(message):
    global printMessagesFlag
    if printMessagesFlag == 1:
        print message
def move_fwd(steps, sleep):
    # raise then clear the parallel-port data lines wired to the
    # accelerate/brake stepper, producing one step pulse per iteration
    global p
    for i in range(0, steps):
        p.setData(32)
        p.setData(0)
        time.sleep(sleep)

def move_bak(steps, sleep):
    # same data lines, opposite direction (brake)
    global p
    for i in range(0, steps):
        p.setData(48)
        p.setData(0)
        time.sleep(sleep)

def move_lft(steps, sleep):
    # pulse the data lines wired to the steering stepper (left)
    global p
    for i in range(0, steps):
        p.setData(12)
        p.setData(0)
        time.sleep(sleep)

def move_rte(steps, sleep):
    # pulse the data lines wired to the steering stepper (right)
    global p
    for i in range(0, steps):
        p.setData(8)
        p.setData(0)
        time.sleep(sleep)
print 'Hello'
tts.Speak("Hello, I am Starting the control Program for Pegasus Bridge One")
time.sleep(4)
runnow()
while (stopProcessingFlag != 1):
    time.sleep(5)
print 'Bye now'
tts.Speak("Destination reached, Control Program Aborting, Good Bye, have a nice day")
time.sleep(4)
Wednesday, May 25, 2005
On being innovative - part I
The same thing will happen if you're running a startup, of course. If you do everything the way the average startup does it, you should expect average performance. The problem here is, average performance means that you'll go out of business. The survival rate for startups is way less than fifty percent. So if you're running a startup, you had better be doing something odd. If not, you're in trouble.
While we are not going to talk about everything in our approach to the vehicle for the Grand Challenge, we can describe how we are doing something odd:
Using the Sensors We Understand
Our approach to the race is to use no range sensors such as lidars. Why?
- First, because everybody else seems to be using them, and it does not make sense to differentiate ourselves by how much money we can spend on a specific sensor; we know we can easily be outspent.
- Second, because we do not see a good path to reusing the technology developed for this project unless we make more sense of the data that humans generally gather to make decisions. In the end, there are far more possibilities for research based on sensors humans are accustomed to than for making sense of data from specially crafted sensors like lidars. There is a cost associated with devising new sensors in a new range of wavelengths, but the central issue is not having made sense of the data one already had access to. If you have been so unsuccessful at making sense of data from sensors you have been using all your life (at least, most of us have had vision), why do you think you will be more successful with other types of sensors? We are not saying it is easy; as the literature shows, it is still a big issue in the robotic vision community, but we think we have a good strategy for sensing and making sense of data.
- Third, it did not seem to help the competitors last year.
Ultimately our interest converges on fusing data from inertial and vision sensors in order to provide navigation as well as obstacle-avoidance capability, much as humans do with their eyes and vestibular system. This is a subject that has only recently been looked into by the academic community, even though much has been done on Structure from Motion. A related interest is how gaze following tells us something about learning.
Besides GPS, IMU and visual data, we are using audio as well. The driving behavior of humans is generally related to the kind of noise they hear while driving. We are also looking at fusing this type of data with the video and the IMU. If you have driven in the Mojave desert, you know that there are stranger sounds there than the ones you hear on a highway. They tell you something about the interaction between one of your sensors (the tires) and the terrain. This is by no means data to be thrown out, since your cameras can tell you whether the same kind of terrain is to be expected ahead of the vehicle.
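As a toy illustration of the kind of audio cue we have in mind (this is not our actual processing chain), the following sketch computes a crude short-term loudness measure from a recorded WAV file using only the Python standard library; the file name is hypothetical.

import wave
import audioop

# Hypothetical recording of tire/terrain noise saved during a test run.
w = wave.open('terrain_noise.wav', 'rb')
width = w.getsampwidth()          # bytes per sample
rate = w.getframerate()
frames_per_chunk = rate / 10      # 0.1 second chunks

chunk_index = 0
while True:
    data = w.readframes(frames_per_chunk)
    if len(data) == 0:
        break
    # Root-mean-square amplitude of this chunk: a crude loudness measure
    # that could be time-stamped and fused with the IMU and video streams.
    loudness = audioop.rms(data, width)
    print '%0.1f s : rms = %d' % (chunk_index * 0.1, loudness)
    chunk_index += 1
w.close()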
Making Data Freely Available and Enabling Potential Collaborative Work with People We Presently Do Not Know
One of the reasons we put data featuring GPS and IMU readings (acceleration, heading and rotation rate) on our web site is that we want people who have neither the means nor the interest to set up a data-gathering system to still be able to use real-world data for some initial findings. More data will be coming in the next few months, and we are looking for an academic or research institution to host them. We have collected 40 gigabytes of data so far, but with a monthly bandwidth allowance of only 42 GB, we do not expect to survive any interest from the robotics community past the first person who downloads all of it (more than 30 people have downloaded the larger file since we made an announcement on robots.net a month and a half ago).
Teaming with other researchers who work specifically in this area is of interest, but it is not our main strategy. The odd part of this strategy is the claim that data do not matter as much as making sense of them: it doesn't matter if you have driven 100 miles and gathered data with your vehicle if you have not done the most important part, the postprocessing.
Using the Right Programming Language
We are currently using the Python programming language. Why? For several reasons:
- We do not want to optimize ahead of time for the various challenges that arise in building an autonomous vehicle; hence we use neither C/C++, Java nor any real-time operating system for development or for the final program. Most announcements on other teams' web sites look for Java and C/C++ programmers.
- Python is close to pseudo-code, and the learning curve for newcomers to the team as well as for veterans is very smooth, especially when coming from Matlab.
- Python provides glue, libraries and modules for all kinds of needs expressed within the project (from linear algebra with LAPACK bindings, to image analysis, to plotting, to speech output with pyTTS, to sound recording and more).
- Thanks to Moore's law, Python is fast enough on Windows, and we would not lose too much sleep if we had to switch to a different OS or write specific, faster modules in C. Using Windows helps in debugging the programs, since our team is geographically distributed and everyone generally uses Windows on their laptops.
Using the Right Classification Approach
One of our approaches to classifying images for road following as well as obstacle avoidance is based on defining some type of distance between drivable and non-drivable scenes. There are many different projection techniques available in the literature. Let us take the example of the "zipping classifier" and how we are using it for obstacle avoidance. This technique uses a compressor like WinZip or 7-Zip to produce a distance between files (here, images). A very nice description of it can be read here. The whole theory is based on Minimum Description Length. An example of a naive classification of pictures we took at the site visit in San Antonio is shown below. In the interest of reproducibility, we used CompLearn to produce this tree (Rudi Cilibrasi is the developer of CompLearn). The tree clearly separates the open areas from the ones with trash cans, and we did no preprocessing of the pictures. Obviously, our own approach is much more specific in defining this distance between a "good" road and a not-so-good one.
[Figure: classification of the trash can obstacle scenes]
Using the Google (image search) API, we can build libraries of imagery relevant to the driving experience and compute their "distance" to the road being driven by our vehicle.
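For readers who want to play with the idea, here is a minimal sketch of the compression distance behind the zipping classifier, using zlib instead of WinZip or 7-Zip (so the numbers will differ from CompLearn's); the image file names are hypothetical.

import zlib

def compressed_size(data):
    # Size in bytes of the zlib-compressed string.
    return len(zlib.compress(data, 9))

def ncd(x, y):
    # Normalized Compression Distance between two byte strings:
    # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / float(max(cx, cy))

# Hypothetical file names: two scenes from the site visit.
open_area = open('open_area.pgm', 'rb').read()
trash_can = open('trash_can.pgm', 'rb').read()
print 'distance =', ncd(open_area, trash_can)

Scenes that compress well together come out close under this distance; a naive classifier would label a new scene according to which library of reference images it lands closest to.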
Let's hope we are doing something odd enough.
Monday, May 16, 2005
Gaze following when car racing - Review -
Anybody who has ever driven in the desert knows that reflexes acquired while driving on a highway do not work as well in this new environment. Case in point: compare the generic driving behavior one can observe in street driving, as shown in these videos from gaze-tracking vendors (1 and 2 or 3), with the roads driven in the WRC (watch the free preview of the Cyprus Rally), which are likely to resemble more closely the type of terrain one will encounter in the Grand Challenge (albeit not at the same speed).
We have been in talks with one of the vendors, but at $25-35K there has not been a point where our interests and theirs overlapped. Maybe some other time.
Why is gaze following important? Because it is the only way to reduce the field of vision that must be decomposed for analysis. If one had a 640 by 480 camera and tried to analyze the entire field of view it provides, the analysis would very likely take too much time for the vehicle to make a relevant decision (and I am not even talking about how the computation is performed or the type of hardware used). Not only that, but human gaze following acts as a filter for the objects and scenes that are relevant to the driving experience. If a vehicle had the same "reflexes", most of the driving problem would eventually be about choice, and about what speed the driver considers "safe" given the terrain. I will talk about terrain some other time, in another review using the data we are making available.
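As a back-of-the-envelope illustration, suppose the frame is a 640 by 480 grayscale array and some attention mechanism hands us a gaze point (both the array layout and the gaze point are assumptions made for the sake of the example); restricting the analysis to a window around that point already cuts the pixel count by an order of magnitude:

import numpy

def gaze_window(frame, gaze_x, gaze_y, half_width=80, half_height=60):
    # Crop a window around the gaze point, clipped to stay inside the frame.
    h, w = frame.shape
    x0 = max(gaze_x - half_width, 0)
    x1 = min(gaze_x + half_width, w)
    y0 = max(gaze_y - half_height, 0)
    y1 = min(gaze_y + half_height, h)
    return frame[y0:y1, x0:x1]

# A fake 640x480 frame; in practice this would come from the camera grabber.
frame = numpy.zeros((480, 640))
roi = gaze_window(frame, 320, 300)
print 'full frame pixels :', frame.shape[0] * frame.shape[1]   # 307200
print 'gaze window pixels:', roi.shape[0] * roi.shape[1]       # 19200, a 16x reduction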
Thursday, May 12, 2005
Site visit in San Antonio
Thursday, April 21, 2005
Site Visit Scheduled
Monday, April 04, 2005
Selected for site visit.
Saturday, April 02, 2005
Web site creation and free inertial/video data available
The reason we set it up is mostly to host a repository of some of the data we gather in the field with Pegasus Bridge 1 (the data include IMU and GPS readings as well as videos shot with the sensor module used in the vehicle). We make these data free to use by anyone who has an interest in them. We are refining the process of obtaining them, so expect some improvements in the format as we go along.
We now have drive-by-wire and autonomous capability
Wednesday, March 30, 2005
Making sense of scenes
DARPA Site Visit Location
For real training, we will also use their facilities in Sabinal, TX.
Thursday, March 17, 2005
How we are going to beat the big guys and win the race
At this point in the game, we believe that only a small team will be able to win the DARPA race. Why? Because we feel that most bigger teams have built up a lot of momentum within their own team and with their partners. They have put together significant amounts of money and effort (read: internal politics) and are therefore likely to constrain themselves early in their choices of architectures and software languages, and more importantly in the ideas and algorithms needed to deal with the specifics of navigating an unknown outdoor environment.
For instance, Mike Montemerlo (whom I do not know) did his thesis at CMU last year on FastSLAM, a very good algorithm (and it is already available on the web). Yet I am personally pretty sure that the CMU Red Team never took advantage of his work... and they were in the same building! Mike seems so convinced that his algorithm is good (and I believe it is too) that he is now the head of software for the Stanford team, and I am pretty sure he has pushed the envelope on making it better. My point still stands though: when you build a large team, a lot of conservatism is built into the design, which keeps you from changing strategy quickly. And indeed the prevalent thinking is that you need a lot of money to outdo the others by adding the best sensors, so that you can be lazy about "understanding" the data coming from them.
I am of the opinion that since humans do not need radar to navigate, it is very likely that we are simply not making sense of the data already available to us (images). What can a radar or a lidar tell you when you have to go through a small water hole?
Do you use additional data such as GIS? We all know that GIS is about as good as the weather is for this type of problem.
We have all been told that a human needs to be behind the wheel to drive well. But we have all seen people doing other things while driving, like eating with chopsticks on the highway, and surely this affects their cognitive capabilities. As it turns out, it has recently been shown that hands-free cell phones are as distracting while driving as handheld phones. It all boils down to task switching and decision making. Once again, even in the human model, having more sensors makes you worse because of the amount of coordination needed.
Our current algorithm development is focused on several high-risk ideas (high risk because they have not been tried in this setting of a robot in an outdoor environment). Here is a list of references to some of them:
With regard to the control of the car/robot, we are looking into results from Bayesian models of learning as they apply to infants and robots, as well as the Bayesian modeling of uncertainty in sensorimotor control in humans. Similarly, we are interested in the BIBA project (Bayesian Inspired Brain and Artefacts), with a particular interest in the application of these techniques to the Cycab (an autonomous golf cart for traveling in cities). Using GIS data we could go after map-based priors, or we could use direct imaging of the ground to build the map priors as we go.
In order to understand and make sense of images, we are looking into using Best Basis Pursuit techniques for image decomposition and their very recent developments. This comes from results showing that natural scene statistics are likely to be sparse under a dictionary of edges and smoother functions.
If that technique fails, we are also looking at distance- and projection-based techniques such as automatic discovery using Google (Google Images), the zip distance, SIFT keypoint detectors, dimension reduction through random projections, as well as fast estimation of distances.
Treating illumination in order to deal efficiently with shadows could be useful. Merging online estimates from the IMU with the images will also most definitely prove useful (a toy sketch of this kind of fusion follows this list).
Since color seems to be a good indicator of depth, maybe we really need a way to use this robust colorization technique to make better sense of images.
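To make the IMU/vision point a bit more concrete, here is a toy sketch (not our actual estimator) of the precision-weighted fusion of two independent Gaussian estimates of the same quantity, say the vehicle's pitch, one from the IMU and one from the image stream; the numbers are made up.

def fuse_gaussians(mean_a, var_a, mean_b, var_b):
    # Bayesian fusion of two independent Gaussian estimates of the same
    # quantity: the posterior is Gaussian, with each estimate weighted
    # by its precision (inverse variance).
    precision_a = 1.0 / var_a
    precision_b = 1.0 / var_b
    fused_var = 1.0 / (precision_a + precision_b)
    fused_mean = fused_var * (precision_a * mean_a + precision_b * mean_b)
    return fused_mean, fused_var

# Made-up numbers: pitch estimate from the IMU (noisy but always available)
# and from the vision pipeline (more precise when the horizon is visible).
imu_pitch, imu_var = 3.2, 4.0
vision_pitch, vision_var = 2.1, 1.0
pitch, var = fuse_gaussians(imu_pitch, imu_var, vision_pitch, vision_var)
print 'fused pitch estimate: %.2f degrees (variance %.2f)' % (pitch, var)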
As one has probably noticed, none of these research areas is very old, and we therefore do not expect the larger teams to take advantage of them. But we think they will make a difference. And, oh, by the way, if you think you want to help in the areas I just mentioned, please get in touch with us.
Monday, March 14, 2005
One down and hopefully several to go!
On a related topic, I will give a presentation next week on what I think are some of the interrelated issues between building an autonomous system like Pegasus Bridge 1 and the cognitive problems found in humans (especially the little ones).