Posted 04/10/15
After an exhaustive (and exhausting!) set of ‘indoor range’ tests that (I thought) gave me a very good understanding of the ‘stuck’ detection issue, I made the changes I thought were necessary and sent Wall-E back out into the real world – where he promptly got stuck and didn’t recover! He got stuck climbing up onto the lip of a rug – and sat there merrily grinding away for what seemed like forever (but was only a minute or so) before I took mercy on him.
Clearly the situation ‘in the field’ isn’t quite as simple as my ‘indoor range’ configuration, but the differences are not obvious. In an effort to figure this out without running around in circles, I’m trying to change just one thing at a time, as follows:
- Changed the ‘STUCK_DIST_DEVIATION_THRESHOLD’ from 5 cm to 10 cm. This helped a little, and didn’t seem to increase the frequency of false positives significantly.
- Changed the MAX_DISTANCE_CM from 200 cm to 100 cm, on the theory that in the ‘real world’ there is more clutter beyond 100 cm that can cause significant measurement deviation. This change caused Wall-E to declare a ‘stuck’ condition almost continually – and I have no idea how THAT happened!
- Changed the MAX_DISTANCE_CM back to 200 to verify that Wall-E’s behavior changed back to what it was before the change. Check.
- Changed the MAX_DISTANCE_CM back to 100 and removed the guard code around the call to UpdateWallFollowMotorSpeeds() in MoveAheadTilStuck(). Changing the MAX_DISTANCE_CM back to 100 caused the false ‘stuck’ declarations to resume, and removing the guard code had no effect one way or the other.
So, what’s the deal with changing the MAX_DISTANCE_CM parameter? It is only used in two places in the code – in the NewPing() constructor for all four sensors, and in the line ‘frontdistval = (frontdistval > 0) ? frontdistval : MAX_DISTANCE_CM + 1;’ in MoveAheadTilStuck(). This line converts a zero reading from the front sensor to MAX_DISTANCE_CM + 1 (101 in this case). Since I’m no longer using the front sensor reading for the ‘stuck’ determination, I have no clue why this line (or lack of it, for that matter) would make any difference.
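For reference, here is that conversion pulled out into a stand-alone function. This is only a sketch: the function name normalizeFrontDistance() is made up for illustration and is not in Wall-E's code, and the constant simply repeats the 100 cm value discussed above. It relies on the NewPing behavior of reporting 0 when no echo comes back within the configured maximum distance.

```cpp
// Value taken from the discussion above; Wall-E's sketch may set it elsewhere.
const int MAX_DISTANCE_CM = 100;

// Hypothetical helper (not in the original code).
// NewPing reports 0 when no echo arrives within the configured range,
// so a zero is mapped to "one past the maximum" instead of "touching".
int normalizeFrontDistance(int rawCm) {
    return (rawCm > 0) ? rawCm : MAX_DISTANCE_CM + 1;
}
```

With MAX_DISTANCE_CM at 100, a raw reading of 0 becomes 101 and any valid reading passes through unchanged.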
The only other potential clue in this whole mess is the way the sensor reading arrays are being handled. The idea was that when a ‘stuck’ detection occurred, the arrays should be re-initialized in such a way that another ‘stuck’ detection could not occur until after another ARRAY_SIZE measurements have been collected. The way I chose to do that was to simply place a large positive reading followed by a zero in the top of each of the 4 arrays, guaranteeing (I thought!) that those two adjacent values would prevent a ‘stuck’ detection for at least ARRAY_SIZE measurement cycles. In order to verify that this ‘poison pill’ feature is actually working, I added the ‘PrintDistInfo()’ function from my PingTest project to Wall-E4 and ran it in debug mode on my bench. Using this technique, I was able to watch (albeit slowly) the ‘poison pill’ values roll through my distance sensor value arrays. So, it appears that is working fine, and the ‘stuck’ detection algorithm is working perfectly, too – in that it detects the ‘stuck’ condition as soon as it is able to (all the real distance information is pretty static with Wall-E sitting on the bench with no power to the motors).
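The ‘poison pill’ idea can be sketched roughly as follows. None of this is Wall-E's actual code: ARRAY_SIZE = 10, the POISON_HIGH_CM value, the function names, the choice of putting the newest reading at the end of the array, and the ‘max minus min below threshold’ test are all assumptions made for illustration.

```cpp
const int ARRAY_SIZE = 10;                      // assumed; the real value isn't given here
const int STUCK_DIST_DEVIATION_THRESHOLD = 10;  // cm, the value tried above
const int POISON_HIGH_CM = 1000;                // hypothetical 'large positive reading'

// Drop the oldest reading (index 0) and append the newest at the end.
void pushReading(int* a, int cm) {
    for (int i = 0; i < ARRAY_SIZE - 1; ++i) a[i] = a[i + 1];
    a[ARRAY_SIZE - 1] = cm;
}

// Overwrite the two newest slots with a large value followed by a zero.
void insertPoisonPill(int* a) {
    a[ARRAY_SIZE - 2] = POISON_HIGH_CM;
    a[ARRAY_SIZE - 1] = 0;
}

// Spread between the largest and smallest stored readings.
int deviation(const int* a) {
    int lo = a[0], hi = a[0];
    for (int i = 1; i < ARRAY_SIZE; ++i) {
        if (a[i] < lo) lo = a[i];
        if (a[i] > hi) hi = a[i];
    }
    return hi - lo;
}

// Assumed criterion: readings that barely change mean the robot isn't moving.
bool looksStuck(const int* a) {
    return deviation(a) < STUCK_DIST_DEVIATION_THRESHOLD;
}
```

In this layout the zero is the last pill value to leave the array, so as long as the real readings are at least the threshold away from zero, looksStuck() stays false until ARRAY_SIZE new readings have been pushed.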
So, the only conclusion I can reach with this information is that the MAX_DISTANCE_CM reduction from 200 to 100 significantly reduced measurement deviation, to the point where Wall-E was declaring ‘stuck’ even when he wasn’t. This tracks with another observation – Wall-E seemed to declare ‘stuck’ just as the distance from one or the other side sensors increased, like an open door or something like that. Apparently this causes an ‘out of bounds’ (zero) return with a MAX_DISTANCE_CM of 100 more often than with 200.
So, what to do? I can just use the differential distance readings between the front and top-front sensors, but while this should work for the slipper case where the front sensor is partially or totally obstructed, it won’t work for the coat rack or rug edge case where both front sensors are unobstructed. It might work to use a two dimensional test; if the two front sensors have close to the same readings but those readings don’t vary over time, OR their readings differ significantly at any time, then declare ‘stuck’. If I go this way, I’ll need to open up the front sensor max distance to something more than 200 cm (300-500?) so Wall-E won’t declare ‘stuck’ in an open hallway. Since the side sensors would no longer be used for the determination, I could keep their max distances short – say 100 cm, which would allow me to shorten the post-ping delays for them a bit.
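Here is a rough sketch of what that two-dimensional test might look like. It is not the code that ended up on Wall-E: the function names, the 20 cm FRONT_PAIR_DIFF_THRESHOLD, and the ‘max minus min’ measure of variation are all assumptions chosen for illustration.

```cpp
#include <cstdlib>  // std::abs

const int STUCK_DIST_DEVIATION_THRESHOLD = 10;  // cm, the value tried earlier
const int FRONT_PAIR_DIFF_THRESHOLD = 20;       // cm, hypothetical

// Spread between the largest and smallest of n readings.
int readingSpread(const int* a, int n) {
    int lo = a[0], hi = a[0];
    for (int i = 1; i < n; ++i) {
        if (a[i] < lo) lo = a[i];
        if (a[i] > hi) hi = a[i];
    }
    return hi - lo;
}

// front[i] and topFront[i] are readings taken in the same measurement cycle.
bool isStuck(const int* front, const int* topFront, int n) {
    // Test 1: the two front sensors disagree badly at any time
    // (e.g. the lower one is blocked by a slipper).
    for (int i = 0; i < n; ++i) {
        if (std::abs(front[i] - topFront[i]) > FRONT_PAIR_DIFF_THRESHOLD) {
            return true;
        }
    }
    // Test 2: the sensors agree, but neither reading changes over time
    // (e.g. hung up on a rug edge with a clear view ahead).
    return readingSpread(front, n) < STUCK_DIST_DEVIATION_THRESHOLD &&
           readingSpread(topFront, n) < STUCK_DIST_DEVIATION_THRESHOLD;
}
```

A robot driving steadily toward a wall sees both readings shrink together, so neither test fires. A blocked lower sensor trips the first test, and a stalled robot with a clear view trips the second.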
- Change MAX_DISTANCE_CM to 400 cm. Use MAX_DISTANCE_CM for the two front sensors and MAX_DISTANCE_CM / 4 for the side sensors.
- Change the ‘stuck’ detection algorithm to use only the front sensors, as discussed above
- Remove the aRightDist and aLeftDist arrays.
- Change the inter-ping delays. It is generally a good idea to wait 20-25 msec between ping sensor triggers to avoid returns from one sensor being interpreted as returns by another sensor. However, I believe it is OK to have no delay between the left and right ping sensors. In order for ping energy from the left sensor to be interpreted as a return by the right sensor, that energy has to arrive at the right sensor after the right sensor has been triggered, and before the right sensor’s own energy gets back. If the delay from left to right sensor activation is more than about 25 msec, there’s no way the first criterion (arriving after the right sensor is triggered) can be met, so this is perfectly safe, if a bit wasteful of time. However, if they are triggered together (no inter-sensor delay), then there is no way the second criterion can be satisfied for any reasonable geometry, as the left sensor’s energy will always have farther to travel by 2 times the distance from the left sensor to the nearest object. So, I believe it is safe to trigger the left and right sensors together, then delay 15-25 msec between the L/R pair and either the top-front or front, and then another 15-25 msec between the two front sensors.
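As a sanity check on those delay figures, the round-trip time of a ping can be worked out from the speed of sound. The sketch below assumes roughly 343 m/s (air at about room temperature); the function name roundTripMicros() is made up for this illustration.

```cpp
// Approximate speed of sound in room-temperature air, in cm per second.
const long SOUND_SPEED_CM_PER_SEC = 34300L;

// Time in microseconds for a ping to reach an object at distanceCm and return.
// Integer math only, so the result is truncated to a whole microsecond.
long roundTripMicros(long distanceCm) {
    return (2L * distanceCm * 1000000L) / SOUND_SPEED_CM_PER_SEC;
}
```

By this estimate an echo from 400 cm takes a little over 23 msec to come back, which lines up with the 20-25 msec guideline, while an echo from 100 cm takes just under 6 msec. That is the basis for thinking the side sensors could get by with shorter post-ping delays once their max distance is cut to 100 cm.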
OK, so I made the changes described above, and Wall-E is still getting stuck, although less frequently than before. In fact, there were a couple of times during the last set of field trials where it seemed that Wall-E was actually doing very well. However:
- The wall following performance is still mediocre at best, especially compared to where it was before I started adding inter ping sensor delays.
- Wall-E still gets stuck and won’t declare ‘stuck’ for no apparent reason. In one case he had his nose pressed firmly up against a solid surface, which should have produced stable readings from both front sensors, but apparently satisfied neither the max deviation nor top-front/front deviation difference criteria. In another, both front sensors were unobstructed, and the nearest obstacle was only about 75 cm away – should have been a slam-dunk, but wasn’t.
At this point, I think the only way forward is to find a way to record what is actually happening with Wall-E during a period where the ‘stuck’ criteria should be met, but nothing is happening. My hope is that I can figure out how to use Arduino’s EEPROM to record data ‘on the fly’.
Stay tuned!
Frank