Numerically controlled oscillators
Thursday, 2015-07-16, 22:30In my latest project I’ve been using numerically controlled oscillators to generate the waveforms. In this post I want to explain the motivation and theory behind them, with a few comments specifically about audio synthesis.
Motivation
We want to output some periodic signal, e.g. a sine with the frequency
frequency
, based on the current time. Let’s say we have some counter
that increments periodically, cycle
, and its frequency f_clk
, as
an integer (for now). cycle
could on the Arduino platform just be a
call to micros()
, the number of microseconds since the program
started running. The f_clk
in that case would be 1000000 - a million
clock cycles (or microseconds) per second.
In this example we’ll just be generating a sine, but you can use whatever you want, as long as it’s periodic.
uint32_t cycle;
int8_t next_signal() {
return 127 * sin(2 * pi * frequency * cycle / f_clk);
}
Note that there’s two different frequencies involved here -
frequency
, which is the frequency of the signal we want to generate,
and f_clk
, the frequency of our counter.
Problem: sin()
is slow if your processor doesn’t have a floating
point unit. If you try running this on an 8-bit AVR microcontroller
clocked at 16 MHz, calculating this function even once is going to
take a few milliseconds. That’s not very useful for most frequencies -
if you can generate 200 amplitudes per second, that limits you to
frequencies below 100 Hz, as per the
sampling theorem.
Getting to an NCO, slowly
Add a lookup table
So, we want to make this faster. Our first idea is to build a lookup table (LUT) and get an index into that. It should cover one period of our function. Calculating this will be slow, but it only has to run once.
uint32_t cycle;
int8_t lut[256];
void build_lut() {
for(int i = 0; i < 256; i++) {
lut[i] = 127 * sin(2 * pi * i / 256.);
}
}
As you can see, the previous frequency * cycle / f_clk
in the call
to sin()
was replaced by i / 256.
, so our index i
can be
calculated as 256 * frequency * cycle / f_clk
:
int8_t next_signal() {
return lut[(uint8_t)(256 * frequency * cycle / f_clk)];
}
frequency * cycle / f_clk
is the current phase. Multiplying that by
256 gets you the appropriate entry in our LUT in the low 8 bits of the
integer component, which you can get at by casting the expression to
an unsigned 8-bit integer.
Much better already - the slow sine was replaced by some multiplications and a division. But let’s look at that a little more closely:
- The cycle counter is going to overflow sooner or later, leading to a discontinuity in the phase. You can prevent (or at least delay) that by using a counter with more bits, but that’s slow.
- The index calculation can be done entirely without floating point values, but that restricts the frequency of our generated signal to integers and you lose the ability to do frequencies like 0.5 Hz.
Exploit the periodic calls to next_signal()
What if next_signal()
gets called periodically as well, e.g. in an
interrupt tied to a timer? The indexes of the LUT will change based on
the number of cycles between each interrupt:
index_{n} = index_{n-1} + frequency * interrupt_period / f_clk
With this, we can change the index calculation to a simple floating point addition and a conversion to int.
int8_t lut[256];
float phase;
float increment;
void set_frequency(float f) {
increment = 256 * frequency * interrupt_period / f_clk;
}
int8_t next_signal() {
phase += increment;
while(phase >= 256) {
phase -= 256;
}
return lut[(int) phase];
}
There’s no more overflow because we decrease the phase after it’s surpassed the maximum index of 255, and it’s a bit faster than the previous example because now there’s only an addition (or two) instead of a few multiplications.
Remove all floating-point arithmetics
But with fixed point arithmetics it gets even faster: Take an N-bit phase accumulator and use its highest M bits (here: 8) as the index into the LUT. Each time our interrupt runs, this accumulator increases by some fixed value that depends on the frequency of our output waveform and the frequency of the interrupt:
uint32_t increment;
void set_frequency(float f) {
float f_int = f_cpu / interrupt_period;
increment = pow(2, 32) / f_int * frequency;
}
If we want a 1 Hz sine, the LUT needs to jump back to its initial
position after one second, meaning that the phase accumulator
overflows and returns to zero after one second. To do this, it needs
to increase by 2^N divided by the frequency of the interrupt f_int
each time the interrupt gets called - after half a second it will have
run f_int/2
times, and the accumulator will be at (2^32 / f_int) * (f_int/2) = 2^31
, exactly in the middle of its period, and after one
second it will of course be at (2^32 / f_int) * f_int = 2^32
and
overflow.
To get different frequencies, multiply the previous increment with the new frequency - higher multipliers lead to a higher increment, meaning a faster overflow and a faster frequency.
The final step is using the LUT again:
uint32_t phase_accumulator;
int8_t next_signal() {
phase_accumulator += increment;
return lut[phase_accumulator >> 24];
}
Now the phase calculation is just an integer addition, followed by a load from memory to get the amplitude of the current waveform. The frequencies of our waveform and interrupt can be floating point values as they’re only used when setting a new frequency. If you’re dealing with tone generation and only need halftones, you can even generate the phase increments of the highest octave you’re going to see and then right-shift those to get the lower octaves, since each octave doubles the frequencies and phase increments of the previous octave.
Modulating the signal
While usually not very useful on slower microcontrollers because of the high frequencies involved (e.g. upwards of 87 MHz for FM radio), it’s quite easy to modulate the new waveforms - amplitude, phase, and frequency modulation are all rather easy to implement.
Amplitude modulation
With this, you can change the volume of generated tones, so it’s still useful on feeble microcontrollers that’re getting forced to play music :)
To do amplitude modulation, just multiply the result from the lookup with the current amplitude of the unmodulated signal:
uint8_t current_amplitude;
int8_t next_signal() {
phase_accumulator += increment;
return (current_amplitude * lut[phase_accumulator >> 24]) >> 8;
}
Make sure that the unmodulated signal (or your LUT) is an unsigned integer, or right-shift by 7 instead of 8.
Frequency modulation
This one’s also easy: Just pick a maximum deviation from your default
frequency f_d
, multiply the signal with it, and add that frequency
to your increment (with the same arithmetics behind it).
uint8_t amplitude;
uint32_t increment_f_d;
void set_frequency_deviation(float fd) {
float f_int = f_cpu / interrupt_period;
increment_f_d = pow(2, 32) / f_int * fd;
}
int8_t next_signal() {
phase_accumulator += increment + (increment_f_d * amplitude) >> 8;
return lut[phase_accumulator >> 24];
}
Adding to the phase increment increases the frequency by some value
between 0 Hz and the frequency deviation, the rest is just like in the
unmodulated example. To change it so our carrier frequency is in the
middle of the band, make current_amplitude
a signed value and only
shift the multplied increment by 7 bits instead of 8.
Phase modulation
For this one you’ll also need another addition, but this time one step
later: Phase modulation works with the phase accumulator instead of
the phase increment. Analogous to frequency modulation, you can set
the maximum phase deviation. In this case it has to be less than 2 pi / 256
- I’ll leave figuring out a way around that as an exercise to
the reader. I’m going to use a phase in the range [0:1]
to get rid
of the factor 2 pi
.
uint8_t amplitude;
uint32_t phase_dev;
void set_phase_deviation(float pd) {
phase_dev = pow(2, 32) * pd;
}
int8_t next_signal() {
phase_accumulator += increment;
uint32_t modulated = phase_accum + (phase_dev * amplitude) >> 8;
return lut[modulated >> 24];
}
Remarks
The maximum frequency you can generate with this NCO is half of the
interrupt frequency, and the resolution is f_int / 2^N
- for my
starting example of an Arduino running at 16 MHz with an interrupt
every 256 cycles and a 16-bit accumulator, that’s a maximum frequency
of 31.25 kHz, which is decent enough for audio. The frequency
resolution is about half a Hertz. Human hearing has a resolution of
about 3.6 Hz, so we’re in the clear with that as well, at
least down to about 125 Hz. Even higher resolutions are of course
easily possible with 24 or even 32-bit accumulators.
On the AVR instruction set, the most basic NCO processing can be implemented in about 20 cycles, with another 6 cycles if you want to do amplitude modulation. This is enough to process all six PWM channels of an ATMega328 separately, and still have time for other things - more pleasant volume changes, MIDI processing, etc.
Of shifts and unions
One thing to note when using GCC on an AVR microcontroller is that it
sometimes actually does 24 right-shifts when you do n >> 24
, so it’s
better to do it with unions if speed is important, just to make sure:
union {
uint32_t i;
uint8_t c[4];
} phase;
int8_t next_signal() {
phase.i += increment;
return lut[phase.c[3]];
}
A basic AVR assembly implementation
Here’s an asm version of next_signal()
, for a 16-bit phase
accumulator. It’ll need 23 cycles (19 without the ret
, or a bit more
if you want to do things when the phase returns to 0.
next_signal:
; Every register used in this function is
; call-clobbered, so it doesn't have to save
; anything.
; Load the low byte of phase and increment
lds r30, phase + 0
lds r31, increment + 0
; Add them
add r30, 31
; Save the low byte of the phase
sts phase + 0, r30
; Same for the high byte, but add with carry
; If you want more than 16 bits for your
; accumulator, repeat this part with changed
; constants (+2, +3...)
lds r30, phase + 1
lds r31, increment + 1
adc r30, r31
sts phase + 1, r30
; Skip the following part if no overflow
; happened. You can just remove this part
; if you don't want to do anything here.
brcc load_amplitude
; Here you can do stuff that should only
; happen when the phase is (almost) zero
load_amplitude:
; Set the high byte of Z to 0, then add
; the address of the LUT to Z
ldi r31, 0
subi r30, lo8(-(lut))
sbic r31, hi8(-(lut))
; Load the amplitude into the return
; register, r24, then return.
ld r24, Z
ret
For a more complete example that handles more than one oscillator, you can take a look at the corresponding function in my MIDI synth
If you have any comments or questions, feel free to contact me.