Author Archives: paynterf

IR Modulation Processing Algorithm Development – Part VIII

Posted 18 June 2017

In my last post on this subject, I showed how I could speed up ADC cycles for the Teensy 3.5 SBC, ending up with a configuration that took only about 5μSec/analog read.  This in turn gave me some confidence that I could implement a full four-sensor digital BPF running at 20 samples/cycle at 520Hz without running out of time.

So, I decided to code this up in an Arduino sketch and see if my confidence was warranted.  The general algorithm for one sensor channel is as follows:

  1. Collect a 1/4 cycle group of samples, and add them all to form a ‘sample_group’
  2. For each sample_group, form I & Q components by multiplying the single sample_group by the appropriate sign for that position in the cycle.  The sign sequence for I is (+,+,-,-), and for Q it is (-,+,+,-) .
  3. Perform steps 1 & 2 above 4 times to collect an entire cycle’s worth of samples.  As each I/Q sample_group component is generated, add it to a ‘cycle_group_sum’ – one for the I and one for the Q component.
  4. When a new set of cycle_group_sums (one for I, one for Q) is completed, use it to update a set of two N-element running sums (one for I, one for Q).
  5. Add the absolute values of the I & Q running sums to form the final demodulated signal value for the sensor channel.

To generalize the above algorithm for K sensor channels, the ‘sample_group’ and ‘cycle_group_sum’ variables become K-element arrays, and each step becomes a K-step loop. The N-element running sum arrays (circular buffers) become [K][N] arrays, i.e. two N-element arrays for each sensor (one for I, one for Q).

All of the above sampling, summing, and circular buffer management must take place within the ~96μSec ‘window’ between samples, but not all steps have to be performed each time.  A new sample for each sensor channel is acquired at each point, but sample groups are converted to cycle group sums only once every 5 passes, and  the running sum and final values are only updated every 20 passes.
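As a rough illustration of how these pieces fit together, here is a minimal sketch in plain C++ (no Arduino calls, so it can be run on a desktop). It is not the sketch used for the measurements in this post; the struct name, member names, and the choice of K = 4 channels and N = 64 buffer entries are assumptions made for the example. The early returns mirror the schedule described above: every pass adds a sample, every 5th pass updates the cycle sums, and every 20th pass updates the running sums and outputs.

```cpp
#include <cstdint>
#include <cstdlib>

// All names and sizes here are illustrative assumptions, not the author's code.
constexpr int kChannels = 4;         // K sensor channels
constexpr int kSamplesPerGroup = 5;  // 20 samples/cycle -> 5 per 1/4 cycle
constexpr int kGroupsPerCycle = 4;   // four 1/4-cycle groups per cycle
constexpr int kBufferLen = 64;       // N cycle sums held in each running sum

constexpr int kSignI[kGroupsPerCycle] = {+1, +1, -1, -1};
constexpr int kSignQ[kGroupsPerCycle] = {-1, +1, +1, -1};

struct Demodulator {
    int32_t sampleGroup[kChannels] = {};
    int32_t cycleSumI[kChannels] = {};
    int32_t cycleSumQ[kChannels] = {};
    int32_t bufI[kChannels][kBufferLen] = {};  // circular buffers of cycle sums
    int32_t bufQ[kChannels][kBufferLen] = {};
    int32_t runI[kChannels] = {};              // running sums over each buffer
    int32_t runQ[kChannels] = {};
    int32_t output[kChannels] = {};            // |I| + |Q| per channel
    int sampleCount = 0;
    int groupIndex = 0;
    int bufIndex = 0;

    // Call once per sample window with one new reading per channel.
    void addSamples(const int32_t (&samples)[kChannels]) {
        // Every pass: accumulate the current 1/4-cycle sample group.
        for (int k = 0; k < kChannels; ++k) sampleGroup[k] += samples[k];
        if (++sampleCount < kSamplesPerGroup) return;
        sampleCount = 0;

        // Every 5th pass: apply the I and Q signs, add to the cycle sums.
        for (int k = 0; k < kChannels; ++k) {
            cycleSumI[k] += kSignI[groupIndex] * sampleGroup[k];
            cycleSumQ[k] += kSignQ[groupIndex] * sampleGroup[k];
            sampleGroup[k] = 0;
        }
        if (++groupIndex < kGroupsPerCycle) return;
        groupIndex = 0;

        // Every 20th pass: newest cycle sum replaces the oldest buffer entry.
        for (int k = 0; k < kChannels; ++k) {
            runI[k] += cycleSumI[k] - bufI[k][bufIndex];
            runQ[k] += cycleSumQ[k] - bufQ[k][bufIndex];
            bufI[k][bufIndex] = cycleSumI[k];
            bufQ[k][bufIndex] = cycleSumQ[k];
            cycleSumI[k] = 0;
            cycleSumQ[k] = 0;
            output[k] = std::abs(runI[k]) + std::abs(runQ[k]);
        }
        bufIndex = (bufIndex + 1) % kBufferLen;
    }
};
```

Feeding one channel a 0/100 square wave (10 samples high, 10 low) for 64 cycles leaves that channel's output at 64 × 1000 = 64000, while a constant, unmodulated input on another channel leaves its output at 0, since both sign sequences sum to zero over a cycle.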

I built up the algorithm in VS2017 and put in some print statements to show how the gears are turning.  In addition, I added code to set a digital output HIGH at the start of each sample window, and LOW when all processing for that pass was finished.  The idea is that if the HIGH portion of the pulse is less than the available window time, all is well. When I ran this code on my Teensy 3.5, I got the following print output (truncated for brevity)

And the digital output pulse on the scope is shown in the following photo

Timing pulse for BPF algorithm run, shown at 10uS/cm. Note the time between rising edges is almost exactly 96uSec, and there is well over 60uSec ‘free time’ between the end of processing and the start of the next acquisition window.

As can be seen in the above photo, there appears to be plenty of time (over 60μSec) remaining between the end of processing for one acquisition cycle, and the start of the next acquisition window.  Also, note the fainter ‘fill-in’ section over the LOW part of the digital output.  I believe this shows that not all acquisition cycles take the same amount of processing time.  Four acquisition cycles out of every 5 require much less processing, as all that happens is the individual samples are grouped into a ‘sample_group’.  So the faint ‘fill-in’ section probably shows the additional time required for the processing that occurs after collection/summation of each ‘sample_group’.

The code for these measurements is included below:

More to come,

Frank

 

IR Modulation Processing Algorithm Development – Part VII

Posted 17 June 2017

In my previous post on this subject, I discussed my decision to change from an Arduino Uno SBC to a Teensy 3.5 for implementing the  ‘degenerate N-path’ digital band-pass filter (BPF) originally introduced to me by my old friend and mentor John Jenkins.  After replacing the Uno with the Teensy and getting everything running  (which took some doing, mostly due to my own ignorance/inability), it was time to see if the change would pay off in actual operation.

In my initial perusal of the available documentation for the Teensy 3.x SBC (have I told you lately how much I love the widespread availability of information on the inet?), I ran across some new programming features that aren’t available in the rest of the Arduino world.  The Teensy 3.x supports two independent 16-bit timers, supported by two new libraries (TimerOne and TimerThree).  When I first looked at this new functionality, I thought – “wow – this is just what I need to implement the sampling front-end portion of the digital BPF – I can use it with an appropriate ISR to get accurate sample timing!”.   And then I ran across Paul’s ‘Delay and Timing’ page with its description of the new ‘elapsedMillis’ and ‘elapsedMicros’ functions.  These functions allow for accurate periodic execution of code blocks inside the normal ‘loop()’ function, without having to deal with interrupts and ISRs – cool!  And then I ran across the ‘FrequencyTimer2’ library written by Jim Studt….

So now I found myself going from no real good options for accurate sample timing to a ‘veritable plethora’ of options, all of which looked pretty awesome – what’s a guy to do?  Since the ‘elapsedMicros’ option looked like the simplest one to implement, I decided to try it first.

elapsedMicros:

From previous work I have a Trinket SBC transmitting an IR beam modulated by a square-wave at approximately 520Hz.  The plan is to sample this waveform 20 times per cycle, and to have the sampling frequency as close as possible to 20×520 = 10.4Ksamples/sec, or approximately 96μS/sample.

I created a small test program to explore the feasibility of using the ‘elapsedMicros()’ function for IR detector sensor sampling.

 

In the above program, I simply generate a 10μS pulse every 95.7μS.  The ‘95.7’ value was empirically determined by watching the transmitted  IR waveform and the 10μS pulses together on a scope, and adjusting the value until the difference between the two frequencies was as small as possible (i.e. when the movement of the transmit waveform compared to the pulse train was as slow as possible), as shown in the short video below:

 

In the above video, the lower trace is the generated pulse train, and the upper trace is the transmitted IR modulation waveform.  The scope trigger was set to the pulse train, with the modulation waveform free to slide left or right based on the ‘beat frequency’ between the two waveforms.

Next, I added code to save ADC samples to an array for later printout.  Now that I am no longer constrained by the minuscule amount of RAM available on the Uno, I opened up the array size to 2000 elements to allow more viewing time before the program was interrupted by the serial output delays.  The code for this and the resulting Excel plot are shown below:

The resulting 2000 element array was dropped into Excel and plotted, as shown below:

All 2000 samples from the test program

First 40 samples. Note that 40 samples covers exactly two cycles of the modulation waveform

So, it looks like the ‘elapsedMicros()’ function is doing exactly what I want it to do – sampling the input waveform at almost exactly 20 samples/cycle without me having to figure out the exact delay time needed.

The next step was to determine how much ‘free time’ is left over for other processing steps like sampling multiple sensor channels, doing the ‘sample’ and ‘cycle’ sums, etc.  For this step, I removed the array loading section and replaced it with a call to ‘delayMicroseconds()’.  Then I manually adjusted the delay value until the period of the pulse train started expanding away from the desired 95.7μS value.  The result was that a delay value of 85μS did not change the pulse period, but a value of 90μS did (slightly).  So, I have between 85 and 90μS of ‘free time’ available (out of a total of 96!!!) for other processing chores.  Adding a single call to ‘analogRead(IRDET_PIN)’ reduced the available ‘free time’ by about 15μS, from between 85 & 90 to between 70 & 75μS.  This shows that the time for a single analog read is about 15μS, which may be due to the same pre-scaling issue as I saw on the Uno (to be determined).  In any case, even if I utilize 4 sensor channels, I should have about 25μS left over for the summation and array load operations.

To investigate the analogRead() timing issues, I set up a small program to measure the time required to read a pin 1000 times.  Here’s the code:

With the above code, and with all default settings, the time required for 1000 reads was 17mSec, so about 17μS, which tracks well with  the above measurements.

After changing the conversion speed to ADC_CONVERSION_SPEED::HIGH_SPEED, the time required for 1000 measurements was reduced to 11mS, so about 11μS per read.

I ran a whole series of tests with the different Teensy ADC library settings, with the following results.  All times are in microseconds, and are the average of 1000 iterations:

  • conversion and sampling speed set to “HIGH”: 10.997
  • all adjustments commented out: 17.281
  • just conversion speed set to “HIGH”: 11.014
  • just sampling speed set to “HIGH”: 15.190
  • just resolution changed to 12 bits: 17.276
  • just resolution changed to 8 bits: 17.242
  • HIGH conversion and sampling speeds, and with 8-bit res: 8.931
  • HIGH conversion and sampling speeds, and with 12-bit res: 10.998
  • All of the above, plus averaging set to 1: 4.758

So, I can get the ADC time down to about 5uS/sensor, which means that even with four sensor channels being monitored, I will have over 70uSec for ‘other stuff’, which should be more than enough to get everything done.
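The time budget behind that estimate can be written out as a small calculation. This is only a sketch: the function names are made up for the example, and it counts ADC read time alone, ignoring the summation and buffer work that also has to fit in the window.

```cpp
// Time between samples, in microseconds, for a given modulation frequency.
constexpr double samplePeriodUs(double modulationHz, int samplesPerCycle) {
    return 1.0e6 / (modulationHz * samplesPerCycle);
}

// Time left in each sample window after reading every channel once.
constexpr double freeTimeUs(double periodUs, int channels, double adcUsPerRead) {
    return periodUs - channels * adcUsPerRead;
}
```

With 520Hz modulation and 20 samples/cycle the window is about 96.2 μS. Four reads at the fastest measured setting (4.758 μS each) leave roughly 77 μS, whereas four reads at the default setting (17.281 μS each) would leave only about 27 μS.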

Frank

 

IR Modulation Processing Algorithm Development – Part VI

Posted 14 June 2017

In my previous posts on this subject, I have been working with an Arduino Uno as the demodulator processor, but I have been plagued by its limitation of 2KB of RAM. This has caused severe limitations with timing debug, as I can’t make debug arrays long enough for decent time averaging, and I can’t do more than one sensor channel at a time.

So, I finally took the plunge and acquired some of Paul J Stoffregen’s Teensy 3.5 processors from their store.  From their site: “Version 3.5 features a 32 bit 120 MHz ARM Cortex-M4 processor with floating point unit. All digital pins are 5 volt tolerant.” The tech specs are shown on this page, but the main features I was interested in are:

  • 120MHz processor speed vs 16MHz for the Uno
  • 192KB RAM vs 2KB for the Uno
  • Analog input has 13 bit resolution vs 10 for the Uno
  • As an added bonus, the Cortex-M4 has an FPU, so integer-only math may be unnecessary.
  • Much smaller physical footprint – the Teensy 3.5 is about 1/4 the area of the Uno
  • Lower power consumption – The Teensy 3.5 at 120MHz consumes about 30mA at 5V vs about 45mA at 5V for the Uno.

Here are some photos of the Teensy 3.5 as installed on my algorithm test bed, and also on my Wall-E2 robot where it might be installed:

Teensy 3.5 installed on my algorithm test bed, with the Uno shown for size comparison. The small processor in the foreground is an Adafruit ‘Trinket’

Side-by-side comparison of the Uno and Teensy 3.5 SBC’s

Closeup of the Teensy 3.5 shown atop the ‘sunshade’ surrounding the IR sensors.  This is a possible installed location

Wider view of a Teensy 3.5 placed atop the ‘sunshade’ surrounding the IR sensors

In addition to all these goodies, the folks at Visual Micro added the Teensy line to their Microsoft Visual Studio add-on, so programming a Teensy 3.5 is just as easy as programming a Uno – YAY!

Of course, I’ll need to re-run all the timing tests I did before, but being able to create and load (almost) arbitrary-length sample capture arrays for debugging purposes will be a great help, not to mention the ability to use floating-point calculations for better accuracy.

Stay tuned,

Frank

 

 

IR Modulation Processing Algorithm Development – Part V

Posted 09 June 2017

In getting the Arduino code working on my Uno/Trinket test setup (shown below), I have been having some trouble getting the delays right.  It finally occurred to me that I should run some basic timing experiments, so here goes:

Sample Group Acquisition Loop:

This is the loop that acquires analog samples from the IR detector, and sums 1/4 cycle’s worth into a single ‘sample group’.  To measure this time, I ran the following code:

    // micros() returns an unsigned long; a 16-bit int on the Uno would overflow
    unsigned long startusec = micros();
    long sum = 0;  // 1000 readings of up to 1023 each do not fit in a 16-bit int
    for (int i = 0; i < 1000; i++)
    {
      int samp = analogRead(SQWAVE_INPUT_PIN1);
      sum += samp;
    }
    unsigned long endusec = micros();
    Serial.print("time required for 1000 analog read/sum cycles = ");
    Serial.println(endusec - startusec);

The time required for 1000 cycles was 15064 uSec, meaning that one pass through the loop takes an average of just over 15 uSec. Adding an 85 uSec delay to the loop should result in a loop time of about 100 uSec, and a 1000 pass loop time of 100,000 uSec or 0.1sec.  The actual result was 99504, or about 99.5 uSec/cycle – pretty close!

Next, I replaced the summation with a write to a 500-element array (couldn’t do 1000 and still fit within the Uno’s 2K memory limit), and verified that this did not materially change the loop timing.  The time required for 500 loops was 49788; twice that time would be 99576, or almost exactly the same as the 99504 time for the summation version.

Then I tweaked the delay to achieve as close to 25 complete cycles as possible, as shown in the Excel plot below.  With an 82uSec loop delay, the total time for 500 loop iterations was 48272, or about 96.544 uSec per loop iteration.

96.544 uSec per loop iteration, and 20 loop iterations per cycle gives 20*96.544 = 1930.88 uSec per cycle or 518 Hz.  This is very close to the 525Hz value I got from my O’scope frequency readout when I first fabricated my little test setup.

Next, I coded 500 iterations of a two-detector capture/sum operation, and got: “time required for 2-detector 500 analog read/store cycles = 15520”.  So, about 31 uSec/iteration, or almost exactly twice the one-detector setup.  A four-detector setup yielded a time of 30352 uSec for 500 iterations, or about 60.7 uSec/iteration.  So, a 4-detector setup is possible, assuming the Uno 2KB memory constraint issue can be addressed successfully.

In summary:

  • It takes about 15 uSec to read each sensor’s A/D value and either sum it or store it in an array
  • A four-sensor setup can probably be accommodated, but only if the required summing arrays fit into available memory (not possible for Uno, but maybe for others).
  • A loop delay value of 82 uSec results in almost exactly 20 samples/cycle.

Stay tuned

Frank

 

 

 

IR Modulation Processing Algorithm Development – Part IV

Posted 07 June 2017

I seem to have grabbed a tiger by the tail in my continued collaboration with my old friend and mentor John Jenkins on this project to extract the estimated magnitude of a square-wave modulated IR signal in the presence of ‘flooding’ from ambient light sources.  The whole thing started out innocently enough when John was at our house a few weeks ago and I showed him my spiffy wall-following autonomous robot Wall-E2.  I mentioned at the time that I was having some trouble getting Wall-E2 to successfully home in on an IR beam to mate with a charging station, and he suggested that a square-wave modulated signal might do the trick, as it would allow me (and Wall-E2) to discriminate between the IR beam and the other interfering signals.

I should have known right then that I was in trouble, as John had that look in his eyes – the one that says “Hmm, that’s an interesting problem…..”.  I have seen that look any number of times over the 40 years or so that I have known him, and it always results in me tearing up my previous work (and my previous assumptions) and starting over again.  The only saving graces in all this are a) It’s a lot of fun when it happens, b) I’m retired now so I don’t care how long it takes or how much work is involved, and c) I’m a masochist at heart! ;-).

So, here we are.  John and I have been exchanging emails over the last week or so, discussing the ‘best’ way to solve this problem.  One of the first things that happened was John started blabbering about ‘Degenerate N-path Bandpass Filters’, and I had no idea what he was talking about (I hate it when that happens!).  Of course I couldn’t tell John that I was completely ignorant, so I made the appropriate noises and raced to educate myself before he discovered my ignorance.  I found a neat video that explained the technique in a way that even I could understand. The video focused on implementing the filter technique in CMOS hardware, but the general technique is applicable to the Arduino world as well, as long as the modulation frequency is low enough to accommodate the lower clock speeds.

John came up with a brilliant graphical illustration of the algorithm used for implementing an N-path band-pass filter.  Even more amazing, he did it in Excel, so the whole damned thing is live – wow!   As shown in the ‘dead’ version below, there are two ‘channels’ – an in-phase (I) and quadrature (Q) ‘channel’.  Both channels start with a group of samples spanning exactly 1/4 cycle that are summed together to form a ‘sample group’.  In the I channel, each group is given a sign in the sequence ‘+’, ‘+’, ‘-’, ‘-’, and in the Q channel this same group is given a sign in the sequence ‘-’, ‘+’, ‘+’, ‘-’.  The four groups in each channel are summed to form the I & Q ‘cycle sums’, and each such sum is added to an N-element circular buffer (one each for the I & Q channels).  The running sums of all elements in each circular buffer are the band-pass filtered I & Q components of the input signal.  The final result is formed by adding the absolute values of the I & Q running sums.

Page 1 of the above document shows the sign sequence for the in-phase and quadrature ‘channels’.  Note that the input to both channels is the same ‘sum-of-5-samples’ group, but the sign changes in a different sequence for the I & Q ‘channels’.  Each 1/4 cycle of the input signal is treated in the same fashion.  The horizontal time scale is specific for the planned 500Hz modulation signal.

Page 2 depicts the real heart of the algorithm, as it shows how each ‘sample sum’ (1/4-cycle sum-of-samples) feeds into the final circular summing buffer.  Four ‘sample sum’ groups (comprising one signal cycle) are summed into a ‘cycle sum’ for both the I & Q channels, and each such ‘cycle sum’ is loaded into the corresponding I or Q circular summing buffer.  The latest ‘cycle sum’ element replaces the oldest element, and therefore the circular summing buffer for each channel represents an N-element band-pass filter, where N is the length of the circular summing buffer.  This diagram has a lot of information in it, so it can take some time to get comfortable with it (it did for me, anyway).

Some other random details:

  • This diagram is set up for a 64-element circular summing buffer, but shorter or longer is OK too.  The band-pass filter bandwidth is inversely proportional to the buffer length.
  • The times noted in the labels at the top are for a presumed 500Hz modulation, with a period of 2000uSec.
  • The notation m = n/5 comes from the fact that (in this particular implementation) there are 20 samples/cycle, which means that each 1/4 cycle group contains 5 samples.  For 64 elements, each comprising 1 entire cycle of samples, there are 64X20 = 1280 samples. So in this implementation, m/4 = 64 ==> m = 256 = n/5 ==> n = 1280 total samples represented in the circular buffer at any one time.

Page 3 shows the contents of one of the two circular buffers, after each complete ‘rotation’ of the buffer.  After 64 complete cycles the buffer is full, as shown in the ’00’ column.  64 cycles later, the buffer contents are as shown in the ’01’ column, and so on.  The text below shows the procedure for starting and running the buffer.  Note in this text that “next cycle value” and “current cycle value” are the same thing, and the “input pointer” and “output pointer” variables are incremented MOD N (N = 64 here).
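A circular summing buffer of this kind might be sketched in C++ as follows. This is an illustration rather than John's procedure: the class name is made up, and because the newest cycle sum always replaces the oldest one, the sketch uses a single index in place of separate input and output pointers. The running sum is updated incrementally, so no pass ever has to re-add all N elements.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical N-element circular summing buffer for one I or Q channel.
template <std::size_t N>
class CircularSum {
public:
    // Overwrite the oldest cycle sum with the newest one and
    // return the updated running sum of all N elements.
    int32_t push(int32_t cycleSum) {
        sum_ += cycleSum - buf_[index_];  // add newest, drop oldest
        buf_[index_] = cycleSum;
        index_ = (index_ + 1) % N;        // index advances MOD N
        return sum_;
    }

    int32_t sum() const { return sum_; }

private:
    int32_t buf_[N] = {};  // starts empty (all zeros)
    std::size_t index_ = 0;
    int32_t sum_ = 0;
};
```

For example, pushing a constant cycle sum of 2 into a `CircularSum<64>` makes the running sum climb for the first 64 cycles and then settle at 128, where it stays for as long as the input is unchanged.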

Pages 4-12 show the input signal, the I/Q channel values, and the I & Q circular buffer running sums for different phase offsets between the transmitted and received signals (transmit & receive frequencies are the same in all cases).  Page 4, for instance, shows the situation for the receiver perfectly in phase with the transmitted signal.  If we look at the I & Q summed values at integral cycle boundaries (-1, 0, +1, etc) we see that the I signal is at 128, and the Q signal is at 0, giving abs(I) + abs(Q) = 128 + 0 = 128.  If the same calculation is performed for all the other phase relationships (i.e. pages 5-12) at the same points, the answer will always be the same, i.e. 128.  This shows that the band-pass filter implementation works as intended, even without any phase-locking requirement.  This treatment assumes that the Tx & Rx clocks are identical, so that the 1/4-cycle sample groups span exactly 1/4 cycle.  Any difference between the two clocks will show up as ripple on the results, proportional to the difference between the two frequencies.  This error term should not be significant for the typical single-board-computer crystal-controlled clocks.
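The phase-independence claim can be checked numerically with a short sketch. The assumptions here are an ideal ±1 square wave with a 50/50 duty cycle, 20 samples per cycle, and phase offsets that are whole numbers of samples; the function and struct names are invented for the example, and the absolute numbers depend on the amplitude chosen, so only the constancy of the result matters.

```cpp
#include <cstdlib>

struct IQ { int i; int q; };

// One cycle of a +/-1 square wave delayed by `phase` samples (0..19),
// demodulated with the I (+,+,-,-) and Q (-,+,+,-) sign sequences.
IQ demodulateCycle(int phase) {
    constexpr int kSamplesPerCycle = 20;
    constexpr int kSignI[4] = {+1, +1, -1, -1};
    constexpr int kSignQ[4] = {-1, +1, +1, -1};
    IQ out{0, 0};
    for (int n = 0; n < kSamplesPerCycle; ++n) {
        // Position of this sample within the delayed waveform's own cycle.
        int shifted = ((n - phase) % kSamplesPerCycle + kSamplesPerCycle) % kSamplesPerCycle;
        int sample = (shifted < kSamplesPerCycle / 2) ? +1 : -1;
        int group = n / 5;  // which 1/4-cycle group this sample falls in
        out.i += kSignI[group] * sample;
        out.q += kSignQ[group] * sample;
    }
    return out;
}

// Final demodulated value: sum of the absolute I and Q values.
int magnitude(IQ v) { return std::abs(v.i) + std::abs(v.q); }
```

At phase 0 this gives I = 20 and Q = 0, at phase 5 (a quarter cycle) it gives I = 0 and Q = 20, and at phase 3 it gives I = 8 and Q = 12. In every case the magnitude is 20: as the phase slides, whatever I loses Q picks up.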

I sent this post off to John for approval, and got the following email back, with some additional clarifying comments:

  • n-path filter is degenerate (my term) because it only uses two paths — I and Q — vice many as shown in video (very good video btw,learn a lot from it too)
  • possible topics for more detail related to last paragraph (but beware that perfect is enemy of good enough or something like that):
    • output ripple
      • frequency ==> difference between tx and rx freqs should be very small
        • only spec found was +/-50ppm, so 100ppm delta worst-case w/o aging, etc
        • 100ppm at 500Hz ==> 0.05Hz
        • believe ripple will be twice difference in freq because of this being a fullwave synchronous rectifier (not verified), so 0.1Hz
      • magnitude ==> variations due to sample(s) from one 1/4-cycle group getting into an adjacent 1/4-cycle group as tx and rx phases slide past each other at <<0.1cycle/sec
        • non-50/50 duty cycle is most likely cause
        • with 5-samples per 1/4-cycle group normalized ripple could vary from 1.0 to 0.8;
          • not a problem for the relative outputs of two sensors for steering
          • could be important in cases where absolute magnitude is needed because bpf (eg, 10Hz) would not average signal for long enough to average out ripple this slow (eg, 1 cycle every 10sec)
          • increasing # of samples per 1/4-cycle group will reduce this effect
    • no antialias filter is used so # of samples per 1/4-cycle group needs to be significantly higher than nyquist requirement for 2/bw to avoid issues (unless clock rate is servo’d to correct freq by monitoring and setting Q value to zero)
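The ripple-frequency numbers in John's notes can be reproduced with a couple of lines of arithmetic. This sketch simply restates his figures; the function names are made up, and the factor of 2 for the full-wave rectifier is his stated (unverified) assumption, exposed here as a parameter.

```cpp
// Frequency difference between two clocks that disagree by `ppm` parts per million.
constexpr double clockOffsetHz(double nominalHz, double ppm) {
    return nominalHz * ppm * 1.0e-6;
}

// Period of the resulting output ripple, in seconds.
// rectifierFactor = 2 reflects the unverified full-wave assumption in the notes.
constexpr double ripplePeriodSec(double nominalHz, double ppm, double rectifierFactor = 2.0) {
    return 1.0 / (rectifierFactor * clockOffsetHz(nominalHz, ppm));
}
```

For 500Hz modulation and a worst-case 100ppm difference, the offset is 0.05Hz and the ripple period works out to 10 seconds, which matches the "1 cycle every 10sec" figure above.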

More to come – stay tuned!

Frank

 

 

 

 

IR Modulation Processing Algorithm Development – Part III

Posted 27 May 2017

In my previous post I demonstrated an algorithm for processing a modulated IR signal to extract an intensity value, but the algorithm takes too long (at least on an Arduino Uno) to allow for 20 samples/cycle (admittedly way over the required Nyquist rate, but…).  So I decided to explore ways of speeding up the algorithm.

First, the baseline:  The starting point is the 17,384 μSec required to process 100 samples in the current algorithm, or 174 μSec/sample.  At an input frequency of 520Hz, 20 samples/cycle is about 96 μSec/sample, so I’m off by a factor of 2 or so.  And this is only for one channel, so I’m really off by a factor of 4 (for a 2-channel setup) or 8 (for my current 4-channel arrangement)

As an experiment, I reduced the running average length from 5 to 1 cycles, or from 100 to 20 samples.  This reduces the shifting operation load by a factor of 5, and resulted in a total processing time of 1876 μSec for all 100 samples – wow!

Then I discovered I had failed to uncomment the line that loads the new running average value into the front of the running average array, so I put that back in and re-ran the measurement.  This time the number came up as 10748 μSec!  This is just not possible!  It is impossible that 10,000 (100 iterations/sample, 100 samples) iterations of a copy operation from one location in the array to another one takes 1/10 the time as 100 iterations (1/sample) of a copy operation from a variable into the array – not possible!!!

But, since it was happening anyway – whether possible or not, I decided I was going to have to figure it out :-(.  So, I changed the line

RunningAvg1[0] = (int)chan1Avg;

to

RunningAvg1[0] = 0;

and re-ran the measurement.  This time the total for processing 100 samples was 1896 μSec – much more believable!  So, what’s the difference between these two operations?  The only thing I could think of is that it must take a lot of time to convert a double to an int.

So, I ran a test where I executed the ‘RunningAvg1[0] = (int)chan1Avg;’ line 10 times, all by itself, and measured the elapsed time.  I got 72 μSec – a much more believable number, but not what I was expecting.  Increasing the number of iterations to 100 resulted in an elapsed time of 672 μSec – consistent with 72 μSec for 10 iterations.  That’s nice, but I’m still not any closer to figuring out what’s going on.

Well, after a bunch more experiments, I think I have the problem narrowed down to the use of floating point math on a few operations.  I have seen some posts to the effect that floating point math is much slower than integer math on Arduino processors, and these experiments tend to bear that out.  I should be OK with integer math everywhere, I hope ;-).

After completely re-writing the algorithm to eliminate floating point math (and correcting several logic errors – oops!), I re-ran the 100-element process for 1 channel, with the following results:

All components – original captured samples, running average, AC component, and full-wave rectified component. Note elapsed time of 3008 uSec

From the above Excel plot, it is clear that the algorithm successfully extracted the full-wave rectified value for the incoming modulated IR signal, and did so in only 3008 uSec for 100 samples.  This should mean that I can easily handle up to three simultaneous channels, and maybe even four – YAY!

Another run with two simultaneous channels was made.  The following Excel plot shows the Channel 2 results, along with the elapsed time for both channels.

Channel 2 all components – original captured samples, running average, AC component, and full-wave rectified component. Note elapsed time of 4268 uSec

The above results for two channels strongly suggest that all four channels in the current hardware implementation can be processed simultaneously while still maintaining a 20 sample/cycle sample rate.  This is extremely good news, as it implies that I can ‘simply’ insert an Arduino Uno or equivalent between the detector array and the robot controller.  The robot controller will continue to see left/right analog values as before (but inverted – more positive is more signal), but background IR interference will be averaged out by the intermediate processor – cool!

Rather than use a Uno, which is physically very large, I hope to be able to use something like an Adafruit Arduino Pro Micro, as shown below:

Adafruit’s Arduino Pro Micro. 16MHz, 9 Analog 12 Digital I/O

This should fit just about anywhere (probably on top of the sunshade), and be very easy to integrate into the system – we’ll see.

Stay tuned!

Frank

 

 

 

IR Modulation Processing Algorithm Development – Part II

Posted 25 May 2017

One of the things I didn’t understand about the analog sample runs from my previous post was why there were so many cycles of the IR modulation signal in the capture record; I had set the algorithm up to capture only 5 cycles, and there were more than 10 in the record – what gives?

Well, after a bit of on-line sleuthing, I discovered the reason was that the A/D conversion process associated with the analogRead() function takes a LOT longer than a digitalRead() operation.   This put a severe dent in my aspirations for real-time processing of the modulated IR signal, as I would have to do this for at least two, and maybe four independent signal  streams, in real time – oops!

One thing I have discovered for sure in the modern internet era; if you are having a problem with something, it is a certainty (i.e. Prob = 100%) that many others in the universe have had the same problem, and most likely someone has come up with (and posted about) one or more solutions.  So, I googled ‘Arduino Uno faster analogRead()’, and got the following hits:

The very first link above took me to this forum post, and thanks to jmknapp and oracle, I found the Arduino code to reset the ADC clock prescale factor from 128 to 16, thereby decreasing the conversion time by a factor of 8, with no reduction in ADC resolution – neat!
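The factor of 8 follows directly from the ADC clock being the CPU clock divided by the prescale factor. The sketch below works that out, assuming the Uno's 16MHz clock and 13 ADC clock cycles per conversion (the ATmega328P figure for a normal conversion); the function name is made up, and the result covers the conversion only, not the rest of the analogRead() call.

```cpp
// Conversion time in microseconds for the Uno's ADC.
// Assumes 13 ADC clock cycles per conversion (normal, non-first conversion).
constexpr double adcConversionUs(double cpuHz, int prescale, int adcClocksPerConversion = 13) {
    return 1.0e6 * prescale * adcClocksPerConversion / cpuHz;
}
```

With a prescale factor of 128 this gives 104 uSec per conversion, and with a prescale factor of 16 it gives 13 uSec.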

To test the effect of the prescaler adjustment, I measured the time it took for 100 ADC measurements with no delay between measurements.  As shown below, there is a dramatic difference in the ‘before’ and ‘after’ plots:

 

100 ADC cycles with no delay, prescale = 128

100 ADC cycles with no delay, prescale = 16

Next, I adjusted the delay between ADC cycles to collect approximately 5 cycles at the 520Hz input rate, as shown below:

Delay adjusted so that 100 samples ~ 5 cycles at 520Hz.

With the prescaler set to 16, the ADC is much faster.  With a 5-cycle collection window at 520Hz, I have about 80 uSec per sample to play with for other purposes, so it seems reasonable that I can handle multiple input streams with relative ease – YAY!!

The next step was to simulate a 4-channel capture operation by capturing 400 samples, 100 each from four different channels. In this simulation, all the data comes from the same IR link, but the processing load and timing are the same.  All the samples from the same time slot are taken within a few microseconds of each other, and the loop (inter-sample) delay was adjusted such that approximately five cycles were captured from each ‘channel’, as shown in the following Excel plot:

Simulated 4-channel capture

As can be seen in the above plot, the channel plots overlap almost exactly.  What this shows is that the Arduino Uno can capture all four IR detector channels at sufficient time resolution (about 20 samples/cycle) for effective IR signal detection/evaluation, and with sufficient time left over (about 30 uSec) for some additional processing.

If the design is changed from four channels to just two, then the processing load goes down significantly,  as shown in the following plot

Simulated 2-channel capture

To complete the simulation, I added the code to perform the following operations on a sample-by-sample basis:

  • Update the running average of the sample array
  • Subtract the running average from the sample, and take the absolute value of the remainder (full-wave rectification)
  • Store the result in another array so it can be plotted. This last step isn’t necessary except for debugging/evaluation purposes
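The three steps above can be sketched in integer-only C++ as follows. This is an illustration, not the code used for the plots: the struct and member names are made up, the 100-sample window (5 cycles at 20 samples/cycle) is taken from the numbers in this post, and the running average is kept with a running sum and a circular index rather than by shifting the array on every sample.

```cpp
#include <cstdint>
#include <cstdlib>

constexpr int kWindow = 100;  // 5 cycles x 20 samples/cycle

struct Rectifier {
    int32_t window[kWindow] = {};  // most recent kWindow samples
    int32_t windowSum = 0;         // sum of everything in window[]
    int next = 0;                  // slot holding the oldest sample

    // Update the running average with `sample`, then return the
    // full-wave rectified AC component |sample - average|.
    int32_t process(int32_t sample) {
        windowSum += sample - window[next];  // newest in, oldest out
        window[next] = sample;
        next = (next + 1) % kWindow;
        int32_t average = windowSum / kWindow;
        return std::abs(sample - average);
    }
};
```

The first 100 calls only prime the window, so their return values are not meaningful. After that, a square wave alternating between 600 and 400 counts averages to 500, and every call returns 100.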

Initial results as shown below are very promising. The following Excel plots show the results of processing 100 ADC samples in real time.  First, 100 samples were loaded into an array to represent the last 100 samples in a real-time scenario, and the running average value was initialized to the average of all these samples.  Then each subsequent real-time sample was processed using the above algorithm and the results were placed in holding arrays for later printout, with the following results:

All components – original captured samples, running average, AC component, and full-wave rectified component

Detail view of original captured samples and the running average component

Detail view of the AC component of the original captured samples and the computed full-wave rectified component

The above plots confirm that the ADC samples can indeed be processed to yield the full-wave rectified intensity of a modulated IR beam.  However, there is a fly in the ointment: it takes too long.  Processing 100 samples took 17,384 μSec (about 174 μSec/sample), but 100 samples at 20 samples/cycle span only approximately 9600 μSec (about 96 μSec/sample), and this is only for one channel :-(.  I will need to find some serious speedup tricks, or reduce the number of samples/cycle, or both in order to fit the processing steps into the time available.
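To make the size of the shortfall concrete, the timing figures quoted above can be turned into a small budget check. This is only an illustrative calculation in C++; the function names `perSampleUs` and `maxSamplesPerCycle` are hypothetical, and it assumes the per-sample processing cost stays the same if the sample rate is reduced.

```cpp
#include <cassert>

// Average time per sample, given a total time for a batch of samples.
double perSampleUs(double totalUs, int samples) {
    assert(samples > 0);
    return totalUs / samples;
}

// Largest whole number of samples per cycle that fits in one cycle,
// assuming each sample costs 'costPerSampleUs' to acquire and process.
int maxSamplesPerCycle(double cycleUs, double costPerSampleUs) {
    assert(costPerSampleUs > 0.0);
    return static_cast<int>(cycleUs / costPerSampleUs);
}
```

With the figures from this post, `perSampleUs(9600.0, 100)` gives a budget of 96 μSec per sample, and `perSampleUs(17384.0, 100)` gives a measured cost of about 174 μSec. One cycle is 20 × 96 = 1920 μSec, so at the measured cost only about 11 samples/cycle would fit, and that is still for a single channel.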

Stay tuned,

Frank

IR Modulation Processing Algorithm Development – Part I

Posted 23 May 2017

As you may recall from previous posts, I have been collaborating with my long-time friend and mentor John Jenkins on the idea of using square-wave modulation of the charging station IR beam to suppress ‘flooding’ from ambient IR sources such as overhead incandescent lighting and/or sunlight.

More than likely, if it is possible to implement a software processing algorithm to recover steering information from potentially corrupted data, it will have to be housed on a dedicated processor.  So, I decided to set up a separate test setup using two processors: one to generate a square-wave modulation waveform, and another to receive that waveform through an IR link.  The link can then be modified in a controlled way to simulate link losses and/or ‘flooding’.  The initial hardware setup is shown below.

Initial test bed verification using 12cm separation

Scope shot showing transmitted and received waveforms, 12cm separation

Algorithm test bed with IR link range set to 78cm

Then I ran my little test program on the receiver processor that simply acquires 100 samples at roughly 20 samples/cycle and then prints out the results.  The following two images are Excel plots of the results for 12cm & 78cm separation.
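A capture loop of this kind might look something like the following. This is a host-side C++ sketch under stated assumptions, not the actual test program: `captureSamples` is a hypothetical name, and the ADC read and the inter-sample delay are passed in as callables so the loop can be exercised off the board. On an Arduino the two callables would presumably wrap `analogRead()` and `delayMicroseconds()`.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Acquire 'count' samples, calling readAdc() for each one and then
// waitUs(delayUs) to set the inter-sample spacing.
std::vector<int> captureSamples(int count, unsigned delayUs,
                                const std::function<int()>& readAdc,
                                const std::function<void(unsigned)>& waitUs) {
    assert(count > 0);
    std::vector<int> samples;
    samples.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        samples.push_back(readAdc());  // take one reading
        waitUs(delayUs);               // pad the loop to the sample period
    }
    return samples;
}
```

The delay value is the knob that sets the number of samples per cycle: the read time plus the delay should come to roughly one twentieth of the modulation period to get about 20 samples/cycle. Printing the returned vector afterwards, rather than inside the loop, keeps the slow serial output from disturbing the sample timing.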

As can be seen from these two plots, the 12 & 78cm separation values provide a reasonably good simulation of the ‘very good’ and ‘reasonably crappy’ signal conditions.

Next I verified that I can successfully ‘flood’ the receiver with my portable battery-operated IR signal generator.  I monitored the transmitted and received waveforms, without and then with flooding.  In both cases, the bottom trace is the 5V square-wave transmitted signal, shown at 2V/div, and the top trace is the received signal shown at 1V/div.  The ground for both traces is the same line on the scope screen.

78cm separation, no flooding signal. Bottom trace is transmit @ 2V/cm, top is receive @ 1V/cm, ground for both is same line

78cm separation, with flooding signal. Bottom trace is transmit @ 2V/cm, top is receive @ 1V/cm, ground for both is same line

Applying flooding signal with battery-operated IR signal generator

As can be seen in the scope photos, I can indeed produce almost 2V of ‘flooding’ using the IR signal generator, so I should be able to determine whether or not a particular recovery algorithm is successful at suppressing flooding effects.

Stay tuned

Frank

How to update the ACP Pyramid

The first thing to understand about updating the Pyramid is that you don’t have to unless there is something wrong with your current Pyramid version (i.e. there is a bug or problem that has been addressed in the latest version available on the ACP website).  As each new version is placed on the ACP website, a ‘change log’ entry is also uploaded, so you can tell what changed from your current version to the one that is on the website. If none of the listed changes affect your practice, then there is no real need to update.

The second thing to understand about updates is that, even if you download the latest version from the ACP site, there is no need to immediately update all your client Pyramid files.  It is completely reasonable to update a client Pyramid to the latest version only when you next deal with that client’s file.  This is done by simply importing the client’s Pyramid data into the new (blank) version using the ‘Import from File…’ button on the Inventory page (don’t forget to use ‘Save As…’ to save the updated client file back to the client folder while leaving the new version file unchanged).  Here’s a video tutorial showing the update procedure.

Charging Station System Integration – New Sunshade Testing

Posted 19 May 2017

While working with John Jenkins on the modulated IR beam idea, I decided to run some tests with the current 4-detector design, to see how the new sunshade with center divider was affected by ambient IR on a bright sunny day.  So, I disabled Wall-E2’s motors and then placed it at several different critical spots in the entry hallway.  At each location I used my little IR test generator to mark the beginning and the end of the test for that location, and then moved on to the next one.  The locations are shown in the following photos, in the order that the tests were run.

Position 1: Near where Wall-E2 transitions from wall-tracking to IR beam homing

Position 2: This is where Wall-E2 has been winding up when it homes on the outside sunlight instead of the IR beam

Position 3: Here Wall-E2 should be firmly fixated on the IR beam

These results, combined with my earlier IR response tests with a single phototransistor, are encouraging, because it is clear that, at least in this case, Wall-E2 should have no difficulty discriminating between the ambient IR and the charging station IR beam.

I ran some homing tests, which Wall-E2 handled with ease; unfortunately God had already turned the lights out on this side of the world, so the tests weren’t in the presence of daylight IR interference.  I’ll do some more real-world discrimination testing tomorrow, and I am hopeful that the new sunshade-with-divider version will be successful, at least for this part of the house.

Stay tuned,

Frank