ESP32-CAM Distance Measurement Study

Recently I returned to working with WallE, my autonomous wall-following robot, and started thinking again about the trouble it has with reflective surfaces. Around the same time I ran across a post about using two ESP32-CAM modules for distance measurement, and I started to wonder if I could do something similar with WallE. I already have a visible red laser on the front, so maybe the two ESP32-CAMs could use the laser ‘dot’ for distance measurements? Would this technique have the same problem with reflective surfaces?

I just happened to have two ESP32-CAM modules in my parts bin, so I thought I would give this idea a try and see how it goes. I know next to nothing about image processing in general and about the ESP32-CAM in particular, so if nothing else it will be a learning experience!

After a bit of web research, I got my Visual Studio/Visual Micro development environment configured for ESP32-CAM program development with the ‘AI Thinker ESP32-CAM (esp32_esp32cam)’ target, and found a couple of examples that came with the newly installed library. The first one I tried was the ‘CameraWebServer’ example (C:\Users\Frank\Documents\Arduino\Libraries\arduino-esp32-master\libraries\ESP32\examples\Camera\CameraWebServer\camera_pins.h), which turns the ESP32-CAM module into a webserver that can be accessed over the local network using any browser. The example provides for still images and real-time streaming – nice! However, I wasn’t interested in this capability, so after looking around a bit more I found an example that just takes still images and writes them to the SD card. I modified the code to convert the captured JPEG into BMP888 format so I could look at the individual color channels in isolation. I set the capture size to 128×128 pixels and captured a JPEG frame. The JPEG frame is just 2352 bytes, but the BMP888 conversion expands to 49206 bytes (128 × 128 × 3 = 49152 bytes of pixel data, plus the standard 54-byte BMP header – a 14-byte file header followed by a 40-byte DIB header). Here’s the code at present:

and here are the JPEG and BMP888 versions of the 128×128 pixel image captured by the camera:

Picture29.jpg
Picture29.bmp

Then I copied Picture29.bmp to another file byte by byte, zeroing out the Green & Blue bytes so that only the red channel was non-zero. However, when I viewed the resulting file, I got the following image:

Picture29_red.bmp

This didn’t make any sense to me, unless the byte ordering in a BMP888 file is BGR or BRG instead of RGB. However, when I researched this on the web, all the info I found indicated that the byte order in an RGB888 file is indeed R, G, B. It’s a mystery!

Here’s the code that produced the above results:

I posted the ‘why is my red channel blue?’ question to StackOverflow, and got the following comment back:

I think your problem is with the reference that you found. ISTR the colour order for RGB888 24 bits per pixel BMP is actually Blue, Green, Red. So your all “red” image will indeed appear blue if you have it backwards. See Wiki BMP & DIB 24 bit per pixel. BTW you can get some funny effects converting all red or all blue images from JPEG to BMP since the effective resolution at source is compromised by the Bayer mask sampling.

Well, at least I’m not crazy – my ‘red’ channel WAS actually the ‘blue’ channel – yay! Per the Wikipedia article, the actual byte order is “… blue, green and red (8 bits per each sample)”.

17 February 2025 Update:

After figuring out the BGR sequence, I moved on to the idea of locating a red laser ‘dot’ on a black background; here’s the experimental setup:

Experimental setup for ‘red dot on black background’ test

And here is the 128×128 pixel image captured by the ESP32-CAM.

So now I needed to find the coordinates for the red dot in the black field. Rather than deal with the tedium of writing and debugging the search routine in Arduino, I decided to suck the image data into Excel, and write a VBA script to find the ‘dot’, as shown below:

This produced the following Excel spreadsheet (scale adjusted to show entire 128×128 pixel layout):

128×128 RGB pixel data with max value highlighted

For comparison purposes, I have repeated the ESP32-CAM image here:

So, it seems pretty clear that I can correctly extract pixel values from the ESP32-CAM image and find the laser dot – at least in this contrived experiment with a non-reflective black background. Also, it appears at first blush like the upper left-hand corner of the ESP32-CAM image corresponds to R1C1 in the Excel spreadsheet.

The next step is to move the ‘dot’ to a significantly different location on the target and see how that affects the location of the max value in the grid – we need this to determine the orientation of the Excel data relative to the image data; maybe I got lucky, and maybe not 😉

02 March 2025 Update:

After setting this project aside for a few weeks, I figured out how to get the ESP32-CAM system to repeatedly grab images, convert them to BMP, and find the maximum red pixel value in the scene. Here’s the code:

When I ran this code in the following experimental setup, I was able to roughly map the row/column layout of the image, as shown:

As shown, the (0,0) row/column location is the upper right-hand corner of the image, and (127,127) is located at the bottom left-hand corner. At the 20cm spacing shown, the image boundaries are about 85mm height x 100mm width.

The next step will be to mount two ESP32-CAM modules on some sort of a frame, with the laser mounted halfway between the two.

06 March 2025 Update:

As part of my evil plan to use two ESP32-CAM modules to optically measure the distance to a red laser dot, I needed the two modules to talk to each other. The ESP32-CAM modules don’t really have the same sorts of two-wire communications facilities as the various Arduino and Teensy modules, but I discovered there is an ‘ESP-NOW’ feature that provides ‘packet’ communications between ESP32 modules over the WiFi radio. I found this tutorial that explains the feature, along with demo code for determining the MAC address of each unit and a separate program to demonstrate the technique. I modified the demo code to just repeatedly send a set of fake sensor values back and forth, to demonstrate to my satisfaction that this technique would work for my intended application. Here’s the code:

And here’s some typical output from the two ESP32-CAM units:

From one device:

From the other device:

A couple of ‘user notes’ about this demo program and its application to two different devices:

  • The MAC address display program has to be run twice – once for each unit to get that all-important information.
  • The demo program also has to be run twice, but the MAC address used for each device is the address for the ‘other’ device.
  • As can be seen from the output, I simply used fake sensor data. However, I made sure to use different sets of values (10,20,30 on one and 20,40,60 on the other) so I could verify that the data was actually getting from one to the other.
  • The user must be careful to make sure the two devices are programmed correctly. I found it really easy to program the same device twice – once with the MAC & data for the other unit, and again with the MAC and data for the unit being programmed (which will not work). I wound up putting clip-on labels on the cables going to the two devices, and then making sure the Visual Studio programming port was correct for the device I was programming. Doable, but not trivial.

21 March 2025 Update:

I broke a finger playing b-ball two days ago, so my typing speed and accuracy have suffered terribly; such is life I guess.

Since my last update I designed and printed a fixture to hold two ESP32-CAM modules and a laser diode so I could run some distance experiments. Here’s a photo of the setup:

10 to 80cm distance setup. Note I’m using only one ESP32-CAM module

I modified the firmware to simply print out the max value in the scene, along with the row/col coordinates for the max value. The firmware continues to save a red-only image as well. Here are the hand-written results:

The numbers at the end of each measurement are the .bmp file suffixes (from picture_red58.bmp to picture_red87.bmp).

And here are the representative red-only photos (one per distance) for the selected measurement:

10cm: 114 @ (40,65) picture58_red.jpg
20cm: 241 @ (72,7) picture62_red.jpg
30cm: 215 @ (66,23) picture64_red.jpg
40cm: 215 @ (65,31) picture68_red.jpg
50cm: 225 @ (57,16) picture74_red.jpg
60cm: 199 @ (64,40) picture79_red.jpg
70cm: 255 @ (33,49) picture84_red.jpg
80cm: 255 @ (36,68) picture85_red.jpg

From the data and the photos, it is easy to see that the laser ‘dot’ doesn’t come into the view of the camera until the 20cm distance, and after 60cm the ‘dot’ is washed out by the normal overhead lighting. In between (20 – 60cm) the ‘dot’ can be seen to progress from the far left-hand edge of the scene toward the middle.

26 March 2025 Update:

I made another run, this time with two cameras, as shown in the following photos:

two ESP32-CAM modules mounted on the same frame, with the red dot laser mounted on the centerline

If my theory is correct, I should be able to see the location of the red dot move horizontally across the images, from left to right for the left cam, and right to left on the right cam. Unfortunately this wasn’t evident in the data. I loaded the above data into Excel and plotted it in various ways. The best I could come up with was to plot row & col locations from each camera vs distance, hoping to see a linear change in either the row or column values. The plots are shown below:

From the above plots, I could see no real progression in the row values, but if I used a lot of imagination I could sort of see a linear decrease in the column values for the left camera and a much less distinct linear increase in the column values for the right camera.

For completeness, I have included the actual camera images used to produce the above data:

Looking at all the above images, I can’t discern *any* real horizontal shift in the position of the red dot. In addition, at 70cm, the reflection of the laser dot off the table surface is just as bright as the reflection off the target, leading to frequent mis-identification of the maximum location.

Conclusion:

Well, this was a nice try and a fun project, but there’s no escaping the conclusion that this ain’t gonna work!
