IR Modulation Processing Algorithm Development – Part XII

Posted 01 July 2017

I went down a bit of a rabbit-hole between my last post on this subject and today.  I was attempting to run down the problems with my IR demodulation code when I discovered that the basic rate at which the demodulator captured samples was off by a factor of 5 or so – yikes!!  Instead of 20 samples per cycle, I was seeing more like 100, as shown below.

Raw data capture using ‘sinceLastOutput = 0’ in the first demodulation step

Raw data capture using ‘sinceLastOutput -= USEC_PER_SAMPLE’ in the first demodulation step

As you might expect, this development threw me for a bit of a loop – as the change from ‘sinceLastOutput = 0’ to ‘sinceLastOutput -= USEC_PER_SAMPLE’ was instrumental (or at least so I thought) to getting the transmit and receive frequencies matched more accurately.  So, now I had to go chase down yet another tangential problem – what I refer to as ‘going down a rabbit hole’. The only saving grace in all this is that, as a twice-retired engineer, I have no deadlines! 😉

To resolve this problem, I wound up creating an entirely new test program to isolate the issue, with just the following lines in the loop() function:

With the ‘sinceLastOutput -= USEC_PER_SAMPLE’ line active, I got about 100 samples/cycle. With this line commented out and the ‘sinceLastOutput = 0’ line active, I got the normal 20 samples/cycle, with nothing else being changed.

Once I was sure the test program was consistently producing the anomalous results I had noticed in the complete program, I posted this issue to the PJRC Forum, so I could get some help from the experts.  I knew it was something I was doing wrong – I just didn’t know what!

Within a few hours I had received several responses, and the one that hit the bullseye correctly identified a subtlety with the ‘-=’ elapsedMicros usage format.  When this format is used, the accompanying ‘elapsedMicros’ variable must be initialized in the setup() code; otherwise it will be some arbitrary (and possibly quite large) value when the loop() function is first entered. This will cause the ‘if’ statement to trigger on every pass until the ‘-=’ line eventually reduces the variable to a value below USEC_PER_SAMPLE, at which point it will start behaving as expected.  This odd behavior never happens with the ‘=0’ usage, as the variable is reset to zero on the first pass through the ‘if’ statement.  Sure enough, when I added a line at the bottom of my setup() function to set the ‘sinceLastOutput’ variable to zero, my little test program immediately stopped misbehaving.
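The effect can be reproduced off the hardware with a small simulation. This is only a sketch in plain C++, not the actual test program: the 100 µs sample period, the 20 µs loop time, and the function name countSamples are all assumptions made for illustration, and an ordinary integer stands in for the ‘elapsedMicros’ variable.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sample period; the post does not give the real USEC_PER_SAMPLE value.
constexpr uint32_t USEC_PER_SAMPLE = 100;

// Simulates the sample-timing 'if' block and returns how many samples are
// taken. All parameters are hypothetical stand-ins for real hardware timing:
//   initialElapsedUs - timer value when loop() is first entered
//   loopPasses       - number of passes through loop() to simulate
//   loopPassUs       - simulated duration of one pass through loop()
//   useSubtract      - true:  sinceLastOutput -= USEC_PER_SAMPLE
//                      false: sinceLastOutput = 0
uint32_t countSamples(uint32_t initialElapsedUs, uint32_t loopPasses,
                      uint32_t loopPassUs, bool useSubtract) {
    assert(loopPassUs > 0);
    uint32_t sinceLastOutput = initialElapsedUs;
    uint32_t samples = 0;
    for (uint32_t pass = 0; pass < loopPasses; ++pass) {
        sinceLastOutput += loopPassUs;  // time advances on every pass
        if (sinceLastOutput >= USEC_PER_SAMPLE) {
            if (useSubtract) {
                sinceLastOutput -= USEC_PER_SAMPLE;  // keeps any leftover time
            } else {
                sinceLastOutput = 0;                 // discards any leftover time
            }
            ++samples;
        }
    }
    return samples;
}
```

With these made-up numbers, 100 passes cover one 2 ms cycle. A timer that starts at zero yields 20 samples in either mode, while a timer that starts at 1,000,000 µs yields 20 samples with ‘= 0’ but 100 samples with ‘-=’, because the ‘if’ fires on every pass. The 5:1 ratio here comes from the chosen loop time, not from a measurement.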

Well, this little side-journey only cost me a couple of days, and a few more white hairs (oh, wait –  my hair is already completely white – no problem!)  Back to my regularly scheduled program…

Frequency Matching:

My friend and mentor John Jenkins, who has been looking over my shoulder (and occasionally whacking me on the head) during this project, was unsure that my frequency matching setup was actually 100% complete, as the video I took the last time didn’t run long enough to convince him (and there were some un-explained triggering glitches as well).  So, I thought I would re-do this part to make him happy.  To do this I modified my little test program from the above ‘rabbit-hole’ elapsedMicros issue to output a square wave from the demodulator board that could be compared to the transmitter square wave.

As shown in the above video, the transmit and demodulator frequencies are quite well matched, showing essentially zero relative drift even over the 30-40 second time of the video.  Mission accomplished! ;-).

Sample Acquisition Step:

I modified my demodulator program to properly initialize the ‘elapsedMicros’ variable being used for sample timing, and verified proper operation by commenting out everything but the sample acquisition step.  I captured several hundred samples, and plotted the first hundred in Excel as follows:

100 samples using proper ‘elapsedMicros’ variable initialization

Sample Sum Step:

Next, each group of five samples (1/4 cycle) is summed, and the ‘In-phase’ and ‘Quadrature’ components are generated using the appropriate sign sequences.  As shown in the following, this appears to be happening correctly:

Cycle Sum Accumulation:

As each sample group is summed and the I/Q components generated, the 1/4-cycle I/Q sums are accumulated into a ‘Cycle Sum’.  As the following printout shows, this step is also being performed properly.
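Separately from the printouts, the sample-sum and cycle-sum steps can be sketched in plain C++ as follows. The sign sequences (+,+,-,- for I and +,-,-,+ for Q) are an assumption, since the post only says ‘the appropriate sign sequences’, and the names cycleSum and makeSquareCycle are hypothetical helpers, not taken from the demodulator code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t SAMPLES_PER_GROUP = 5;   // 1/4 cycle, per the post
constexpr std::size_t GROUPS_PER_CYCLE = 4;
constexpr std::size_t SAMPLES_PER_CYCLE = SAMPLES_PER_GROUP * GROUPS_PER_CYCLE;

// Assumed sign sequences for the four quarter-cycle sums.
constexpr int I_SIGN[GROUPS_PER_CYCLE] = {+1, +1, -1, -1};
constexpr int Q_SIGN[GROUPS_PER_CYCLE] = {+1, -1, -1, +1};

struct IQ {
    long i;
    long q;
};

using Cycle = std::array<int, SAMPLES_PER_CYCLE>;

// Sums each group of five samples, applies the I and Q signs, and
// accumulates the four signed group sums into one cycle sum.
IQ cycleSum(const Cycle& samples) {
    IQ sum{0, 0};
    for (std::size_t g = 0; g < GROUPS_PER_CYCLE; ++g) {
        long groupSum = 0;
        for (std::size_t s = 0; s < SAMPLES_PER_GROUP; ++s) {
            groupSum += samples[g * SAMPLES_PER_GROUP + s];
        }
        sum.i += I_SIGN[g] * groupSum;
        sum.q += Q_SIGN[g] * groupSum;
    }
    return sum;
}

// Hypothetical test input: one cycle of a square wave that goes high
// 'shift' samples after the start of the cycle and stays high for 10 samples.
Cycle makeSquareCycle(int high, int low, std::size_t shift) {
    assert(shift < SAMPLES_PER_CYCLE);
    Cycle samples{};
    for (std::size_t n = 0; n < SAMPLES_PER_CYCLE; ++n) {
        const std::size_t phase = (n + SAMPLES_PER_CYCLE - shift) % SAMPLES_PER_CYCLE;
        samples[n] = (phase < SAMPLES_PER_CYCLE / 2) ? high : low;
    }
    return samples;
}
```

Under these assumptions, a square wave aligned with the sampling window lands entirely in the I sum, and one shifted by a quarter cycle lands entirely in the Q sum, which is why both components are carried through the rest of the algorithm.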


Running Sum Accumulation:

The last step in the algorithm is to compute the N-cycle running sum of all the cycle sums.  This is done by subtracting the oldest value from the circular buffer from the current running sum, adding the current cycle sum to the running sum, and then replacing the oldest cycle sum value in the circular buffer by the current cycle sum.

  1. RunningSum = RunningSum + CurrentCycleSum – OldestCycleSum
  2. OldestCycleSum = CurrentCycleSum
  3. Circular buffer index incremented by 1 (MOD N)
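The three steps above can be sketched as a small C++ class. This is a minimal illustration rather than the code running on the Teensy: the class name RunningSum and its members are hypothetical, and N = 64 is taken from the ‘64-cycle running sums’ mentioned later in this post.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N_CYCLES = 64;  // running-sum length in cycles

class RunningSum {
public:
    // Adds one cycle sum and returns the updated N-cycle running sum.
    long add(long currentCycleSum) {
        assert(index_ < N_CYCLES);
        // 1. RunningSum = RunningSum + CurrentCycleSum - OldestCycleSum
        runningSum_ += currentCycleSum - buffer_[index_];
        // 2. OldestCycleSum = CurrentCycleSum
        buffer_[index_] = currentCycleSum;
        // 3. Circular buffer index incremented by 1 (MOD N)
        index_ = (index_ + 1) % N_CYCLES;
        return runningSum_;
    }

    long value() const { return runningSum_; }

private:
    std::array<long, N_CYCLES> buffer_{};  // zero-initialized cycle sums
    std::size_t index_ = 0;                // always points at the oldest entry
    long runningSum_ = 0;
};
```

Because the buffer starts out zeroed, the running sum ramps up over the first N cycles and only then reflects a full N-cycle window. Each update costs one add and one subtract regardless of N, which is the reason for keeping the circular buffer instead of re-summing all N values every cycle.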

This one took a while to instrument properly. I first tried just adding some more columns on to the current display setup, but that became too cumbersome, too fast.  Verifying the running sum calculation requires looking at not only the current running sum, but also its value from N cycles (or N*5*4 samples) previously.  So, I modified the code to only print one line per cycle, and this was much easier to manage.  Here’s a partial printout showing a little over 200 cycles, representing about 400mSec (200 cycles * 2mSec/cycle).


To analyze these results, I dropped them into Excel, and used its ‘Freeze Panes’ feature to juxtapose the results from cycles 0 through 4 with cycles 65, 66, and 67.  This allowed me to verify that the running sum expression was being calculated correctly and the circular buffer was being loaded and referenced correctly.  When I finished verifying these results, I plotted the final value column, as shown below:


New ‘Final Value’ results

As an experiment, I also momentarily blocked the IR path with my finger, and plotted the results, as shown below:

Filter response with finger blocking IR path for a few seconds, then removed

And again, just waving my finger into and out of the IR path several times over several seconds

Filter response with finger moved in and out of IR path several times over 3-5 seconds

John’s comment on all this was “still not right”, as explained below:


“Think I misinterpreted earlier plot as the one below makes it clearer what is happening.  The flat segments are when sample timing is proper.  So 3 good sets of cycle samples in a row is max before there is a relative phase slip of ~1/10 of 180degs of Tx signal or ~96us = (time_for_180deg = (1/520)/2) / 10.
The slips are thus equal to 10 raw samples which are lost every ~35 Tx cycles = (51-16) for CSumQ and ~34 Tx cycles (68-34) for CSumI.  So 20 raw samples (1 full Tx cycle) are lost every ~69 Tx cycles and this results in 64-cycle running I and Q sums being only 5/64 of 64 times the actual signal magnitude because all but 5 of the 64 values in running sums are canceled out by other values in the I and Q running sums.  Applying this to the running sum of abs(I)+abs(Q) gives ~35.8K which is close to what you are seeing — that is:  ~35,823 = (5/64)*(64*7164.667)
Seems almost certain that the calcs are correct and the problem is that the Teensy can’t keep up with the needed sample rate (probably due coding approach or interrupts but less likely “magic timing code” Tx and Rx were very stable on scope).  Back to thinking ADC –> DMA may be the cure”
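The two numbers in John's note can be re-derived with a few lines of C++. This is only an arithmetic check of the figures quoted above (520 Hz transmit frequency, 5 surviving values out of 64, per-cycle magnitude 7164.667); the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Time for 1/10 of a half cycle (180 degrees) of the Tx signal, in microseconds.
double phaseSlipUs(double txHz) {
    assert(txHz > 0.0);
    const double halfCycleSec = (1.0 / txHz) / 2.0;
    return halfCycleSec / 10.0 * 1e6;
}

// Running-sum magnitude when only 'surviving' of the 'n' buffered cycle
// values fail to cancel: (surviving / n) * (n * perCycleMagnitude).
double residualRunningSum(int surviving, int n, double perCycleMagnitude) {
    assert(n > 0 && surviving >= 0 && surviving <= n);
    return (static_cast<double>(surviving) / n) * (n * perCycleMagnitude);
}
```

Calling phaseSlipUs(520.0) gives roughly 96 µs and residualRunningSum(5, 64, 7164.667) gives roughly 35,823, matching the ~96us and ~35.8K figures in the quote.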

My take on the above was that it is the print statements that are the cause of the problem, not anything to do with Teensy ADC speed.  To test this theory, I removed all but the ‘Final Value’ print statement, and put that one statement in a block that executed only once per 64 cycles.  With this change, I could see that the print output was keeping up with real time and not lagging as before. The plot below shows several IR beam block/unblock cycles, and it appears that the ‘Final Value’ output is much more responsive to the block/unblock events.  Interestingly, it appears there is some sort of Gibbs phenomenon associated with the ‘fast’ cycles when the system switches between one equilibrium level and another.

The next step is to remove the print statements entirely and route the ‘Final Value’ output to one of the two DAC lines on the Teensy 3.5.  To test this idea, I adapted the HobbyTronics sine wave generator code to my project (pin A21 on the Teensy 3.5 vs A14 on the Teensy 3.2), and got the following display on my ‘scope:

Output from Teensy 3.5 DAC0 (A21) pin

So, now I know that the DAC output works – now I just need to scale the ‘Final Value’ numbers correctly and connect them to the DAC output.  Then I can watch the filter action in real time without worrying about the impact of print statements.

With an input square wave amplitude of about 0.35V p-p, or about 0.35/3.3 = 0.106 of full scale (FS), the ‘Final Value’ output is about 29,000.  Therefore the peak (full-scale) output from the FV stage should be about 29,000/0.106 ~ 273,584, so an input of 0.106 FS should produce a DAC output of (29,000/273,584) * 3.3V = 0.106 * 3.3V = 0.35V.
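The scaling arithmetic above can be sketched as a pair of helper functions. This is a rough illustration in plain C++, not the modified sine wave code: the function names are hypothetical, and it assumes the DAC is run at 12-bit resolution (0 to 4095 counts) with a 3.3V reference.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr double FV_FULL_SCALE = 273584.0;  // full-scale 'Final Value' from the text
constexpr double DAC_VREF = 3.3;            // assumed DAC reference voltage
constexpr int DAC_BITS = 12;                // assumed DAC resolution
constexpr int DAC_MAX_COUNTS = (1 << DAC_BITS) - 1;

// Maps a 'Final Value' onto DAC counts, clamping to the valid range.
uint16_t finalValueToDacCounts(double finalValue) {
    double fraction = finalValue / FV_FULL_SCALE;
    if (fraction < 0.0) fraction = 0.0;
    if (fraction > 1.0) fraction = 1.0;
    return static_cast<uint16_t>(std::lround(fraction * DAC_MAX_COUNTS));
}

// Converts DAC counts back to the expected output voltage.
double dacCountsToVolts(uint16_t counts) {
    assert(counts <= DAC_MAX_COUNTS);
    return counts * DAC_VREF / DAC_MAX_COUNTS;
}
```

With these assumptions, a Final Value of 29,000 maps to 434 counts, or about 0.35V. On the Teensy the count would presumably be passed to analogWrite() on the DAC pin after setting analogWriteResolution(12), but that wiring is not shown here.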

So, I modified my Sinewave test code to output the Final Value number, scaled by 273,584, and got the following display.  In this test, the input amplitude is about 0.35V p-p, and the output is about 0.35VDC




Stay tuned,









