Nokia Composer Ringtone Synthesizer in 512 Bytes

A little bit of nostalgia in our new translation: we try to write our own Nokia Composer and compose a melody with it.

Have any of you used an old Nokia, for example the 3310 or 3210? Then you probably remember one great feature: the ability to compose your own ringtones right on the phone keypad. By arranging notes and pauses in the desired order, you could play a popular melody from the phone speaker and even share your creation with friends! If you missed that era, this is what it looked like:

Not impressed? Just trust me, it sounded really cool back then, especially for those who were into music.

The musical notation and format used in Nokia Composer is known as RTTTL (Ring Tone Text Transfer Language). RTTTL is still widely used by hobbyists to play monophonic ringtones on Arduino and similar platforms.

RTTTL allows you to write music for only one voice: notes can only be played sequentially, without chords or polyphony. However, this limitation turned out to be a killer feature, since such a format is easy to write, read, parse and play back.

In this article, we’ll try to create an RTTTL player in JavaScript, adding a bit of code golf and math to keep the code as short as possible, just for fun.

Parsing RTTTL

RTTTL has a formal grammar. An RTTTL string consists of three parts: the name of the melody, its settings, such as tempo (BPM, beats per minute), octave and note duration, and the melody itself. However, we will mimic the behavior of Nokia Composer itself: parse only the melody part and take the BPM tempo as a separate input parameter. The name of the melody and its settings are outside the scope of this article.

A melody is simply a sequence of notes / rests, separated by commas with optional whitespace. Each note consists of a length (2/4/8/16/32/64), a pitch (c / d / e / f / g / a / b), an optional sharp sign (#) and an octave number (from 1 to 3, since only three octaves are supported).

The easiest way is to use regular expressions. Modern browsers come with a very handy method, matchAll, which returns an iterator over all matches in a string:

const play = s => {
  for (const m of s.matchAll(/(\d*)?(\.?)(#?)([a-g-])(\d*)/g)) {
    // m[1] is optional note duration
    // m[2] is optional dot in note duration
    // m[3] is optional sharp sign, yes, it goes before the note
    // m[4] is note itself
    // m[5] is optional octave number
  }
};
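To see what the regular expression captures, here is a minimal sketch that collects the groups into plain objects. The helper name parseMelody and the field names are made up for this illustration and are not part of the player itself.

```javascript
// Hypothetical helper (not part of the original player): turns a melody
// string into a list of parsed notes using the same regular expression.
const NOTE_RE = /(\d*)?(\.?)(#?)([a-g-])(\d*)/g;

function parseMelody(melody) {
  const notes = [];
  for (const m of melody.matchAll(NOTE_RE)) {
    notes.push({
      length: m[1] ? Number(m[1]) : null, // 8 in "8c", null if omitted
      dotted: m[2] === '.',               // true for "8.c"
      sharp: m[3] === '#',                // true for "8#c"
      pitch: m[4],                        // "a".."g", or "-" for a rest
      octave: m[5] ? Number(m[5]) : null, // 2 in "8c2", null if omitted
    });
  }
  return notes;
}

console.log(parseMelody('8.#c2, 4-, 16e1'));
```

Characters that cannot start a note, such as commas and spaces, simply never match, so they act as separators without any extra code.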

The first thing to figure out about each note is how to convert it to the frequency of a sound wave. We could of course create a lookup table for all seven note letters. But since these letters are consecutive, it is easier to think of them as numbers. For each note letter, we take its numeric character code (ASCII code). For “A” this will be 0x41, and for “a” it will be 0x61. For “B / b” it will be 0x42 / 0x62, for “C / c” it will be 0x43 / 0x63, and so on:

// 'k' is an ASCII code of the note:
// A..G = 0x41..0x47
// a..g = 0x61..0x67
let k = m[4].charCodeAt();

We can drop the high-order bits and use only k & 7 as the note index (a = 1, b = 2, …, g = 7). What’s next? The next step is not very pleasant, as it involves music theory. We have only 7 note letters, but an octave contains 12 semitones. This is because the sharp / flat notes are unevenly hidden between the natural notes:

         A#        C#    D#       F#    G#    A#         <- black keys
      A     B | C     D     E  F     G     A     B | C   <- white keys
      --------+------------------------------------+---
k&7:  1     2 | 3     4     5  6     7     1     2 | 3
      --------+------------------------------------+---
note: 9 10 11 | 0  1  2  3  4  5  6  7  8  9 10 11 | 0

As you can see, the note index in octave increases faster than the note code (k & 7). In addition, it increases non-linearly: the distance between E and F or between B and C is 1 semitone, not 2, as between the rest of the notes.
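The k & 7 row of the table is easy to check. The sketch below uses a throwaway function, letterIndex (a name chosen only for this example), and shows that upper and lower case letters give the same index.

```javascript
// Hypothetical helper: the low three bits of the ASCII code.
const letterIndex = ch => ch.charCodeAt(0) & 7;

console.log([...'abcdefg'].map(letterIndex)); // indices 1 through 7
console.log([...'ABCDEFG'].map(letterIndex)); // the same indices 1 through 7
```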

Intuitively, we can try multiplying (k & 7) by 12/7 (12 semitones and 7 notes):

note:          a     b     c     d     e      f     g
(k&7)*12/7: 1.71  3.42  5.14  6.85  8.57  10.28  12.0

If we look at these numbers without the decimal places, we will immediately notice that they are non-linear, as we expected:

note:                 a     b     c     d     e      f     g
(k&7)*12/7:        1.71  3.42  5.14  6.85  8.57  10.28  12.0
floor((k&7)*12/7):    1     3     5     6     8     10    12
                                  -------

But that is not quite right. The semitone steps should fall between B / C and E / F, not between C / D. Let’s try other ratios (semitone steps are underlined):

note:              a     b     c     d     e      f     g
floor((k&7)*1.8):  1     3     5     7     9     10    12
                                           --------

floor((k&7)*1.7):  1     3     5     6     8     10    11
                               -------           --------

floor((k&7)*1.6):  1     3     4     6     8      9    11
                         -------           --------

floor((k&7)*1.5):  1     3     4     6     7      9    10
                         -------     -------      -------

It is clear that the values 1.8 and 1.5 are not suitable: the first has only one semitone step, and the second has too many. The other two, 1.6 and 1.7, look good: 1.7 gives the major scale G-A-BC-D-EF, and 1.6 gives A-BC-D-EF-G, the white keys of C major (letters written together are one semitone apart). Just what we need!
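The comparison of ratios can also be done in code. This is only a sketch: stepPattern is a hypothetical helper that lists the distance in semitones between neighbouring letters a..g for a given ratio.

```javascript
// Hypothetical helper: semitone steps between consecutive letters a..g
// when the note index is computed as floor((k & 7) * ratio).
function stepPattern(ratio) {
  const values = [1, 2, 3, 4, 5, 6, 7].map(i => Math.floor(i * ratio));
  return values.slice(1).map((v, i) => v - values[i]);
}

for (const ratio of [1.8, 1.7, 1.6, 1.5]) {
  console.log(ratio, stepPattern(ratio).join(' '));
}
// The white keys from A to G need the steps 2 1 2 2 1 2
// (semitones between B/C and E/F), which is what 1.6 produces.
```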

Now we need to change the values a little so that C is 0, D is 2, E is 4, F is 5, and so on. We should offset by 4 semitones, but subtracting 4 will make the A note below the C note, so instead we add 8 and calculate modulo 12 if the value is out of an octave:

let n = (Math.floor((k&7) * 1.6) + 8) % 12; // floor drops the fractional part
// A  B C D E F G A  B C ...
// 9 11 0 2 4 5 7 9 11 0 ...

We must also take into account the “sharp” sign, which is captured by regex group m[3]. If it is present, we raise the note by 1 semitone:

// we use !!m[3]: if m[3] is "#", it evaluates to `true`
// and gets converted to `1` because of the `+` sign.
// If m[3] is an empty string, it turns into `false` and, thus, into `0`:
let n = (Math.floor((k&7) * 1.6) + 8)%12 + !!m[3];

Finally, we must use the correct octave. Octaves are already stored as numbers in regex group m[5]. According to music theory, each octave is 12 semitones, so we can multiply the octave number by 12 and add it to the note value:

// n is a semitone index counted from the base C; every octave adds 12,
// so with octaves 1..3, 12 is C of the lowest octave and 47 is B of the highest.
let n =
  (Math.floor((k&7) * 1.6) + 8)%12 + // note index 0..11
  !!m[3] +                           // semitone 0/1
  m[5] * 12;                         // octave number
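Putting the three terms together, a minimal sketch of the whole calculation might look like this. The function name noteIndex and its parameters are assumptions made for this example; the arithmetic is the one described above, with an explicit Math.floor.

```javascript
// Hypothetical helper: semitone index of a note.
//   letter - "a".."g" (either case)
//   sharp  - true if the note is marked with "#"
//   octave - octave number as an integer
function noteIndex(letter, sharp, octave) {
  const k = letter.charCodeAt(0);
  const base = (Math.floor((k & 7) * 1.6) + 8) % 12; // C=0, D=2, ... B=11
  return base + (sharp ? 1 : 0) + 12 * octave;
}

console.log(noteIndex('c', false, 0)); // 0
console.log(noteIndex('a', false, 1)); // 9 + 12 = 21
console.log(noteIndex('f', true, 2));  // 5 + 1 + 24 = 30
```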

Clamping

What happens if someone specifies an octave number of 10 or 1000? This can lead to ultrasound! We should only allow a valid range of values for such parameters. Limiting a number to the range between two others is commonly called “clamping”. A built-in Math.clamp(x, low, high) has been proposed for JavaScript, but it is not something we can rely on in browsers yet. The simplest alternative is to use:

clamp = (x, a, b) => Math.max(Math.min(x, b), a);

But because we’re trying to keep our code as small as possible, we can reinvent the wheel and avoid the Math functions. We use the default x = 0 so that clamping also works with undefined values:

clamp = (x=0, a, b) => (x < a && (x = a), x > b ? b : x);

clamp(0, 1, 3) // => 1
clamp(2, 1, 3) // => 2
clamp(8, 1, 3) // => 3
clamp(undefined, 1, 3) // => 1

Note Tempo and Duration

We expect BPM to be passed as a parameter to our play() function. We just have to validate it:

bpm = clamp(bpm, 40, 400);

Now, to calculate how long a note should last in seconds, we take its musical duration (whole / half / quarter / …), which is stored in regex group m[1]. We use the following formula:

note_duration = m[1]; // can be 1,2,4,8,16,32,64
// since BPM is "beats per minute", or usually "quarter note beats per minute",
// BPM/4 would be "whole notes per minute" and BPM/60/4 would be "whole
// notes per second":
whole_notes_per_second = bpm / 240;
duration = 1 / (whole_notes_per_second * note_duration);

If we combine these formulas into one and limit the duration of the note, we get:

// Assuming that default note duration is 4:
duration = 240 / bpm / clamp(m[1] || 4, 1, 64);

Also, do not forget that notes can be dotted, which increases the length of the note by 50%. Group m[2] holds either a dot or an empty string. Using the same trick that we used earlier for the sharp sign, we get:

// !!m[2] would be 1 if it's a dot, 0 otherwise
// 1+!!m[2]/2 would be 1 for normal notes and 1.5 for dotted notes
duration = 240 / bpm / clamp(m[1] || 4, 1, 64) * (1+!!m[2]/2);
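As a worked example of the formula, here is a small sketch with readable names. The function noteDuration and its parameters are made up for illustration, and it uses the Math-based clamp rather than the golfed one.

```javascript
// Hypothetical helper: length of a note in seconds.
//   bpm    - quarter-note beats per minute
//   length - 1 = whole, 2 = half, 4 = quarter, 8 = eighth, ...
//   dotted - true adds 50% to the length
const clamp = (x, lo, hi) => Math.max(lo, Math.min(x, hi));

function noteDuration(bpm, length = 4, dotted = false) {
  const seconds = 240 / clamp(bpm, 40, 400) / clamp(length, 1, 64);
  return dotted ? seconds * 1.5 : seconds;
}

console.log(noteDuration(120, 4));       // quarter note at 120 BPM: 0.5
console.log(noteDuration(120, 8));       // eighth note: 0.25
console.log(noteDuration(120, 4, true)); // dotted quarter: 0.75
console.log(noteDuration(120, 1));       // whole note: 2
```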

Now we can calculate the note index and duration for each note. Time to use the WebAudio API to play a melody.

WebAudio

We only need three parts of the whole WebAudio API: an audio context, an oscillator to generate the sound wave and a gain node to turn the sound on and off. I will use a square-wave oscillator to make the melody sound like that awful old phone ringing:

// Osc -> Gain -> AudioContext
let audio = new (window.AudioContext || window.webkitAudioContext)();
let gain = audio.createGain();
let osc = audio.createOscillator();
osc.type="square";
osc.connect(gain);
gain.connect(audio.destination);
osc.start();

This code by itself will not create music yet, but when we parse our RTTTL melody, we can tell WebAudio which note to play, when, at what frequency and for how long.

WebAudio parameters, such as the oscillator frequency or the gain of a gain node, have a special method, setValueAtTime, that schedules a value change at a given time.
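As a quick way to hear setValueAtTime in action before wiring up the parser, here is a browser-only sketch that plays a single beep. The function name beep is an assumption for this example, and browsers may keep the audio context suspended until the page receives a user gesture such as a click.

```javascript
// Hypothetical helper: play one square-wave beep in the browser.
//   frequency - pitch in Hz
//   seconds   - how long the beep lasts
function beep(frequency, seconds) {
  const Ctx = window.AudioContext || window.webkitAudioContext;
  const audio = new Ctx();
  const osc = audio.createOscillator();
  const gain = audio.createGain();
  osc.type = 'square';
  osc.connect(gain);
  gain.connect(audio.destination);

  const start = audio.currentTime;                 // "now" on the audio clock
  osc.frequency.setValueAtTime(frequency, start);  // set the pitch
  gain.gain.setValueAtTime(1, start);              // sound on
  gain.gain.setValueAtTime(0, start + seconds);    // sound off
  osc.start(start);
  osc.stop(start + seconds);
}

// Usage (in a browser, e.g. from a click handler): beep(440, 0.5);
```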

If you remember, earlier in the article we already had the ASCII code of the note stored as k, the note index as n, and the duration of the note in seconds. Now, for each note, we can do the following:

t = 0; // current time counter, in seconds
for (m of ......) {
  // ....we parse notes here...

  // Note frequency is calculated as (F*2^(n/12)),
  // Where n is note index, and F is the frequency of n=0
  // We can use C2=65.41, or C3=130.81. C2 is a bit shorter.
  osc.frequency.setValueAtTime(65.4 * 2 ** (n / 12), t);
  // Turn on gain to 100%. Besides notes [a-g], `k` can also be a `-`,
  // which is a rest sign. `-` is 0x2d in ASCII. So, unlike other note letters,
  // (k&8) would be 0 for notes and 8 for rest. If we invert `k`, then
  // (~k&8) would be 8 for notes and 0 for rest. Shifting it by 3 would be
  // ((~k&8)>>3) = 1 for notes and 0 for rests.
  gain.gain.setValueAtTime((~k & 8) >> 3, t);
  // Increase the time marker by note duration
  t = t + duration;
  // Turn off the note
  gain.gain.setValueAtTime(0, t);
}
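The two bits of arithmetic in this loop, the frequency formula and the rest detection, can be tried on their own without any audio. The names frequency and gainFor below are chosen just for this sketch.

```javascript
// Hypothetical helpers that mirror the expressions used in the loop.
const BASE_C = 65.4; // approximate frequency of C2 in Hz, as in the article

// Each semitone multiplies the frequency by 2^(1/12),
// so 12 semitones (one octave) double it.
const frequency = n => BASE_C * 2 ** (n / 12);

// 1 for note letters, 0 for the rest sign "-" (bit 3 of the ASCII code).
const gainFor = ch => (~ch.charCodeAt(0) & 8) >> 3;

console.log(frequency(12).toFixed(1)); // one octave up: "130.8"
console.log(frequency(21).toFixed(1)); // A, 21 semitones up: close to 220 Hz
console.log([...'abcdefg-'].map(gainFor).join('')); // "11111110"
```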

That’s all. Our play() function can now play entire melodies written in RTTTL notation. Here is the complete code with some minor tweaks, such as using v as a shortcut for setValueAtTime and using one-letter variables (C = context, z = oscillator because it produces a similar sound, g = gain, c = clamp):

c = (x=0,a,b) => (x<a&&(x=a),x>b?b:x); // clamping function (a<=x<=b)
play = (s, bpm) => {
  C = new AudioContext;
  (z = C.createOscillator()).connect(g = C.createGain()).connect(C.destination);
  z.type="square";
  z.start();
  t = 0;
  v = (x,v) => x.setValueAtTime(v, t); // setValueAtTime shorter alias
  for (m of s.matchAll(/(\d*)?(\.?)(#?)([a-g-])(\d*)/g)) {
    k = m[4].charCodeAt(); // note ASCII [0x41..0x47] or [0x61..0x67]
    n = 0|(((k&7) * 1.6)+8)%12+!!m[3]+12*c(m[5],1,3); // semitone index, 12 per octave
    v(z.frequency, 65.4 * 2 ** (n / 12));
    v(g.gain, (~k & 8) / 8);
    t = t + 240 / bpm / (c(m[1] || 4, 1, 64))*(1+!!m[2]/2);
    v(g.gain, 0);
  }
};

// Usage:
play('8c 8d 8e 8f 8g 8a 8b 8c2', 120);
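Since play() only produces sound, it is hard to check its timing by ear. The sketch below is an audio-free variant, under the assumption that we only want to inspect what would be scheduled: the hypothetical function schedule returns a list of events instead of calling WebAudio, and it runs in Node as well as in a browser.

```javascript
// Hypothetical, readable counterpart of play(): returns the events
// that would be scheduled instead of playing them.
const clampNum = (x = 0, lo, hi) => Math.max(lo, Math.min(Number(x), hi));

function schedule(melody, bpm) {
  const events = [];
  let t = 0; // start time of the current note, in seconds
  for (const m of melody.matchAll(/(\d*)?(\.?)(#?)([a-g-])(\d*)/g)) {
    const k = m[4].charCodeAt(0);
    const n = (Math.floor((k & 7) * 1.6) + 8) % 12 // note within the octave
      + (m[3] ? 1 : 0)                             // sharp
      + 12 * clampNum(m[5], 1, 3);                 // octave, default 1
    const duration = 240 / clampNum(bpm, 40, 400)
      / clampNum(m[1] || 4, 1, 64)                 // note length, default 4
      * (m[2] ? 1.5 : 1);                          // dotted note
    events.push({
      start: t,
      duration,
      frequency: 65.4 * 2 ** (n / 12),
      audible: m[4] !== '-',                       // false for rests
    });
    t += duration;
  }
  return events;
}

console.log(schedule('8c 8d 4-', 120));
```

At 120 BPM the two eighth notes start at 0 and 0.25 seconds, and the quarter rest occupies the following half second.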

When minified with terser, this code is only 417 bytes long. This is still below the 512 byte threshold. Why don’t we add a stop() function to interrupt playback:

C=0; // initialize audio context C at the beginning with zero
stop = _ => C && C.close(C=0);
// using `_` instead of `()` for zero-arg function saves us one byte :)

This is still around 445 bytes. If you paste this code into the developer console, you can play RTTTL melodies and stop playback by calling the JS functions play() and stop().

UI

I think adding a little UI to our synthesizer will make composing music even more enjoyable. At this point, I would suggest forgetting about code golf. We can build a tiny editor for RTTTL ringtones without counting bytes, using plain HTML and CSS, and include the minified script for playback only.

I decided not to post the code here as it’s pretty boring. You can find it on GitHub. You can also try the demo here: https://zserge.com/nokia-composer/

If the muse has left you and you don’t feel like writing music at all, try a few existing songs and enjoy the familiar sound:

Nokia ringtone
iPhone ringtone, if you like modern music more
Light my fire
Lose yourself
The Good, The Bad, and The Ugly
Rondo Alla Turca (Mozart)

By the way, if you actually composed something, share the URL (the whole song and its BPM are stored in the hash portion of the URL, so saving / sharing your songs is as easy as copying or bookmarking the link).

I hope you enjoyed this article. You can follow the news on GitHub, on Twitter, or subscribe via RSS.