Wednesday, March 30, 2005

Making sense of scenes

So it looks like our approach of using only cameras, as opposed to sonars/radars, makes sense given the current literature on the subject. The key to understanding how to do this should also give insight into issues related to cognition and its defects, as I pointed out in my presentation to SCM.

DARPA Site Visit Location

Like the folks at Austin Robot Technology, we will be using the facilities at the Southwest Research Institute.

For real training, we will also use their facilities in Sabinal, TX.

Thursday, March 17, 2005

How we are going to beat the big guys and win the race

At this point in the game, we believe that only a small team will be able to win the DARPA race. Why? Because we feel that most bigger teams have built up a lot of momentum within their own team and among their partners. They have put together a significant amount of money and effort (read: internal politics) and are therefore likely to constrain themselves early in their choices of architectures, software languages, and, more importantly, in the ideas and algorithms needed to deal with the specifics of navigating an unknown outdoor environment.

For instance, Mike Montemerlo (whom I do not know) did his thesis at CMU last year on FastSLAM, a very good algorithm (and it is already available on the web). Yet I am personally pretty sure that the CMU Red Team never took advantage of his work... and they were in the same building! Mike seemed so convinced that his algorithm was good (and I believe it too) that he is now the head of software for the team at Stanford. I am pretty sure that he has pushed the envelope on making it better. My point still stands though: when you build a large team, a lot of conservativeness is built into the design, and that does not allow you to change strategy quickly. And indeed the prevalent thinking is that you need a lot of money to outdo the others by adding the best sensors, so that you can be lazy in "understanding" the data coming from them.
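FastSLAM itself cannot be reproduced in a few lines, but here is a toy sketch of the particle-filtering idea it builds on (the 1-D world, the beacon, and all numbers are illustrative assumptions, not anyone's actual system): particles carry pose hypotheses, are propagated with noisy odometry, and are reweighted by a range measurement to a known landmark.

```python
import math
import random

# Toy sketch of the particle-filtering idea underlying FastSLAM
# (not FastSLAM itself): a robot moves along a line past a beacon
# at x = 10 and localizes itself from noisy range readings.
random.seed(1)
BEACON, SIGMA = 10.0, 0.5

def likelihood(err, sigma=SIGMA):
    # unnormalized Gaussian weight for a range-prediction error
    return math.exp(-0.5 * (err / sigma) ** 2)

particles = [random.uniform(0.0, 20.0) for _ in range(500)]
true_x = 3.0
for _ in range(10):
    true_x += 1.0                                        # robot moves 1 m
    particles = [p + 1.0 + random.gauss(0, 0.2) for p in particles]
    z = abs(BEACON - true_x) + random.gauss(0, SIGMA)    # noisy range
    weights = [likelihood(z - abs(BEACON - p)) for p in particles]
    # resample particles in proportion to their weights
    particles = random.choices(particles, weights=weights, k=len(particles))

estimate = sum(particles) / len(particles)
print(round(estimate, 1))  # should hug the true pose, 13.0
```

The real FastSLAM attaches a small EKF per landmark to each particle; this sketch only keeps the pose-hypothesis part of the idea.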

I am of the opinion that, since humans do not need radar to navigate, it is very likely that we are simply not making sense of the currently available data (images). What can a radar or a lidar tell you when you have to go through a small water hole?

Do you use additional data such as GIS? We all know that GIS data is only as good as the weather for this type of problem.

We have all been told that a human needed to be behind the wheel to drive well. But we have all seen people doing different things while driving, like eating with chopsticks on the highway,

and surely this affects their cognitive capabilities. As it turns out, it has recently been shown that hands-free cell phones are as distracting while driving as handheld ones. It all boils down to task switching and decision making. Once again, even in the human model, having more sensors makes you worse because of the amount of coordination needed.

Our current algorithm development is focused on several high-risk ideas (high risk because they have not been tried in this setting of a robot in an outdoor environment). Here is a list of references to some of them:

With regard to the control of the car/robot, we are looking into using results from Bayesian models of learning as they apply to infants and robots, as well as the Bayesian modeling of uncertainty in human sensorimotor control. Similarly, we are interested in the BIBA project (Bayesian Inspired Brain and Artefacts), with a particular interest in the application of these techniques to the Cycab (an autonomous golf cart for traveling in cities). Using GIS data, we could build map-based priors, or we could use direct imaging of the ground to build the map priors as we go.
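As a concrete (and entirely hypothetical) illustration of the map-prior idea, here is a minimal sketch: a GIS-derived prior probability of traversability per grid cell is updated with a noisy image-based observation via Bayes' rule. The grid, the prior values, and the sensor model are made-up assumptions, not our actual pipeline.

```python
import numpy as np

# Prior P(cell is traversable), e.g. derived from GIS data (assumed values).
prior = np.array([0.7, 0.5, 0.2, 0.9])

# Assumed sensor model for the image classifier:
# P(image says "clear" | traversable) and P("clear" | blocked).
p_clear_given_t, p_clear_given_b = 0.8, 0.3

def update(prior, saw_clear):
    """Posterior P(traversable | observation) per grid cell, via Bayes' rule."""
    like_t = p_clear_given_t if saw_clear else 1 - p_clear_given_t
    like_b = p_clear_given_b if saw_clear else 1 - p_clear_given_b
    num = like_t * prior
    return num / (num + like_b * (1 - prior))

post = update(prior, saw_clear=True)
print(np.round(post, 3))  # every cell's belief moves toward "traversable"
```

The same one-line update, applied cell by cell as the vehicle drives, is how a map prior and live imagery could be merged into a running belief.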

In order to understand and make sense of images, we are looking into using Best Basis Pursuit techniques for image decomposition and their very recent developments. This comes from results showing that natural scene statistics are likely to be sparse under a dictionary of edges and smoother functions.
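To make the sparsity idea concrete, here is a minimal sketch, assuming a tiny random dictionary and an ISTA-style iterative soft-thresholding solver as a stand-in for a full basis pursuit solver: a 2-sparse coefficient vector is recovered from the signal it synthesizes.

```python
import numpy as np

# Sparse-decomposition sketch (ISTA stand-in for basis pursuit):
# recover a 2-sparse code under an overcomplete random dictionary.
rng = np.random.default_rng(0)
n, m = 8, 16
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)        # unit-norm dictionary atoms

x_true = np.zeros(m)
x_true[2], x_true[11] = 1.5, -2.0     # the sparse "scene" coefficients
y = D @ x_true                        # observed signal

lam = 0.01                            # l1 penalty weight
step = 1.0 / np.linalg.norm(D, 2) ** 2
x = np.zeros(m)
for _ in range(3000):                 # gradient step, then soft threshold
    x = x - step * (D.T @ (D @ x - y))
    x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)

print(np.flatnonzero(np.abs(x) > 0.1))   # indices of recovered atoms
```

In the image case, the atoms would be edges and smooth functions rather than random vectors, but the recovery mechanics are the same.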

If this first technique fails, we are also looking at distance- and projection-based techniques such as automatic discovery using Google (Google Images), the zip distance, SIFT keypoint detectors, high-dimensional reduction through random projections, as well as fast estimation of distances.
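Of these, the zip distance is the easiest to demonstrate: the normalized compression distance scores two byte strings as similar when their concatenation compresses almost as well as the larger one alone. Below is a minimal sketch using zlib as the compressor; the test strings are made up.

```python
import zlib

def c(data: bytes) -> int:
    """Compressed length, used as a stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance ("zip distance")."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 20
near = b"the quick brown fox jumps over the lazy dog " * 19 + b"cat "
far = bytes(range(256)) * 4

print(ncd(a, near) < ncd(a, far))  # similar data scores lower -> True
```

For images one would feed in pixel blocks rather than text, but the compressor-as-similarity trick is identical.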

Treating illumination in order to deal efficiently with shadows could be useful. Merging online estimates from an IMU with images will also most definitely prove useful.
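The IMU/image merging can be illustrated with the simplest possible scheme, a complementary filter; everything below (the signals, the gyro bias, the blend factor alpha) is a made-up assumption, not our actual estimator. The gyro is integrated for short-term heading, while occasional vision-based heading fixes pull the accumulated drift back.

```python
def fuse(gyro_rates, vision_headings, dt=0.01, alpha=0.98):
    """Complementary filter: gyro_rates in rad/s per step;
    vision_headings gives an absolute heading fix or None per step."""
    theta = 0.0
    for rate, fix in zip(gyro_rates, vision_headings):
        theta += rate * dt                  # dead-reckon from the gyro
        if fix is not None:                 # blend in the vision fix
            theta = alpha * theta + (1 - alpha) * fix
    return theta

# Gyro with a constant 0.05 rad/s bias; the true heading stays at 0,
# and vision reports that fact once every 10 steps.
n = 5000
rates = [0.05] * n
fixes = [0.0 if i % 10 == 0 else None for i in range(n)]
print(abs(fuse(rates, fixes)) < abs(fuse(rates, [None] * n)))  # -> True
```

Without the vision fixes the bias integrates into 2.5 rad of drift; with them the heading error stays bounded near a quarter of a radian.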

Since color seems to be a good indicator of depth, maybe we really need a way to use this robust colorization technique to make better sense of images.

As one has probably noticed, none of these research areas is very old, and therefore we do not expect the larger teams to take advantage of them. But we think they will make a difference. And oh, by the way, if you think you want to help in the areas I just mentioned, please get in touch with us.

Monday, March 14, 2005

One down and hopefully several to go!

Our complete application was received by DARPA. In the video, we are not showing autonomous driving, as we ran into some unexpected difficulties unrelated to the race. Our entry is pretty basic on the mechanical side, and we expect most of the difference to be made at the level of how the information gathered from our different sensors is exploited. It is one thing to accumulate a lot of data, but it is far more important to make sense of it. This is why, for instance, we will not use a laser ranging system like most other teams.
On a related topic, I will give a presentation next week on what I think are some of the interrelated issues found in building an autonomous system like Pegasus Bridge 1 and the cognitive problems found in humans (especially the little ones).

Thursday, March 10, 2005