
SF2 RMIDI Format Extension Specification

Original format was created by Microsoft and later expanded by the MIDI Manufacturers Association.

Original format expansion idea by Zoltán Bacskó of Falcosoft, later expanded by spessasus. Specification written by spessasus with the help of Zoltán.

Revision 1.22

Preamble

MIDI files have long faced a significant challenge: the same file sounds different on different devices. SF2 + MIDI combinations partially address this issue, because playing both files through an SF2-compliant synth produces the same sound everywhere.

The RMIDI format is not new; it was originally developed by Microsoft as a RIFF wrapper for MIDI files and later expanded by the MIDI Manufacturers Association to support embedding DLS sound banks. However, DLS is not widely used today, whereas the SoundFont2 (SF2) format serves a similar purpose and remains quite popular. The SF2 RMIDI format integrates MIDI and SF2 files into a single file, augmented with additional metadata. This document serves as a specification for this format extension.

This version of RMIDI was created by Zoltán Bacskó of Falcosoft and implemented in Falcosoft SoundFont Midi Player 6. I am in contact with Zoltán, who granted permission for me to write this specification.

If you find any part of this specification unclear, please reach out via this thread or file a GitHub issue in this repository. Also feel free to report typos or suggest expansions!

Design Goals

This extension has been designed with the following goals in mind:

Table of Contents

Terminology

This specification assumes familiarity with the SoundFont2 format and the Standard MIDI File (SMF) format. Additional terminology used in this specification includes:

Extension

The file extension is .rmi. The file type should be referred to as MIDI with embedded SF2, Embedded MIDI or SF2 RMIDI.

RIFF Chunk

The RMIDI format uses RIFF chunks to structure the data.

The RIFF format is unchanged from the original RMIDI format. It is described here for completeness.

Each RIFF chunk in an RMIDI file follows this format:

Example chunk

52 49 46 46 05 00 00 00 48 65 6C 6C 6F 00
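
The example chunk above can be decoded with a few lines of JavaScript. This is only a minimal sketch, assuming the standard RIFF layout (a 4-byte ASCII header, a 4-byte little-endian size, the data, and one pad byte when the size is odd). The helper name `readRIFFChunk` and the shape of the returned object are assumptions made for illustration, not part of this specification.

```javascript
// Hypothetical helper: parses one RIFF chunk starting at `offset` in a Uint8Array.
function readRIFFChunk(bytes, offset = 0) {
    // 4-byte ASCII chunk header
    const header = String.fromCharCode(...bytes.subarray(offset, offset + 4));
    // 4-byte unsigned little-endian size of the data that follows
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const size = view.getUint32(offset + 4, true);
    // the chunk data itself (a view, not a copy)
    const data = bytes.subarray(offset + 8, offset + 8 + size);
    // data of odd size is followed by a single pad byte
    const next = offset + 8 + size + (size % 2);
    return { header, size, data, next };
}

// the example chunk from above
const example = Uint8Array.from([
    0x52, 0x49, 0x46, 0x46, // "RIFF"
    0x05, 0x00, 0x00, 0x00, // size: 5
    0x48, 0x65, 0x6c, 0x6c, 0x6f, // "Hello"
    0x00 // pad byte
]);
const chunk = readRIFFChunk(example);
console.log(chunk.header, chunk.size, String.fromCharCode(...chunk.data));
// prints "RIFF 5 Hello"
```

The `next` field points at the byte following the chunk (14 here), which makes it straightforward to read consecutive chunks in a loop.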

SF2 RMIDI File Specification

File Structure

An RMIDI file consists of:

Example file

The following file structure shows that:

  1. The bank offset is 1.
  2. Info chunks are encoded using UTF-8 encoding.
  3. The song’s title is “Never Gonna Give You Up.”
  4. The song’s artist is “Rick Astley.”
  5. The song’s creation date is “1987.”
  6. The song has an embedded sound bank.

Handling Differences

When the file structure deviates from the above:

  1. Any additional chunks after the specified ones should be ignored and preserved as-is.
  2. If the chunk order differs from this specification, the file should be rejected.
  3. If no soundfont bank is present, the file should use the main soundfont and assume a bank offset of 0, ignoring the DBNK chunk.
  4. If the soundfont bank uses the older DLS format, software not capable of reading DLS should reject the file.

Software that supports DLS should use the contained DLS and assume a bank offset of 1 or try to detect the bank offset since the older format does not specify the DBNK chunk.

The last two rules ensure backwards compatibility with the older RMIDI format.
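
To apply rules 3 and 4, a reader first has to tell which kind of sound bank, if any, is embedded. The sketch below is one possible approach, assuming that an SF2 bank is a RIFF file with the form type `sfbk` and a DLS bank is a RIFF file with the form type `DLS ` (with a trailing space). The function name `detectSoundBankType` and its return values are hypothetical.

```javascript
// Hypothetical helper: inspects the bytes that follow the INFO chunk.
// Returns "none", "sf2", "dls" or "unknown".
function detectSoundBankType(bytes) {
    if (!bytes || bytes.length === 0) {
        return "none"; // rule 3: no embedded bank, use the main one with offset 0
    }
    if (bytes.length < 12) {
        return "unknown"; // too short to hold a RIFF header and a form type
    }
    const ascii = start => String.fromCharCode(...bytes.subarray(start, start + 4));
    if (ascii(0) !== "RIFF") {
        return "unknown";
    }
    // the form type sits right after the 4-byte header and the 4-byte size
    const form = ascii(8);
    if (form === "sfbk") {
        return "sf2";
    }
    if (form === "DLS ") {
        return "dls"; // rule 4: reject unless the software can read DLS
    }
    return "unknown";
}

// helper for the demonstration only: builds a 12-byte RIFF header with the given form type
const fakeBank = form => Uint8Array.from(
    [..."RIFF", "\x04", "\0", "\0", "\0", ...form].map(c => c.charCodeAt(0))
);
console.log(detectSoundBankType(fakeBank("sfbk"))); // prints "sf2"
console.log(detectSoundBankType(fakeBank("DLS "))); // prints "dls"
console.log(detectSoundBankType(new Uint8Array(0))); // prints "none"
```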

INFO Chunk

The INFO chunk describes file metadata and the soundfont’s bank offset.

The INFO chunk may contain the following optional chunks:

Metadata Chunks

Below are the defined chunks containing additional information about the song:

Chunk Rules

The following rules apply to the INFO chunk:

  1. The order of chunks within the INFO chunk is arbitrary.
  2. Text chunks must contain a terminal zero byte.
  3. Chunks of length 0 are illegal and should be discarded.
  4. Unknown INFO chunks should be ignored and preserved as-is.
  5. If the IENC chunk is not specified, the software can use any encoding, but assuming utf-8 is recommended.
  6. If the MENC chunk is not specified, the software decides which encoding to use for the MIDI file's text events.
  7. If the software can display the song’s name, it should use the INAM chunk if present, ignoring the MIDI track name.
  8. Compatible software may ignore all INFO chunks except the DBNK chunk for the most basic level of compatibility.
  9. The chunk size must be even, as specified in the general RIFF structure.
  10. The INFO chunk is optional. The software must not assume that the INFO chunk exists.
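
Rules 2, 3 and 5 can be sketched as a small decoding helper. This is an illustration only: the name `decodeInfoText` and its parameters are assumptions, and it relies on the `TextDecoder` API available in browsers and Node.js, which throws a `RangeError` for encoding labels the runtime does not know.

```javascript
// Hypothetical helper: decodes the data of a text INFO chunk (for example INAM).
// `data` is a Uint8Array, `encoding` is the value of IENC or undefined if IENC is absent.
function decodeInfoText(data, encoding) {
    if (data.length === 0) {
        return undefined; // rule 3: chunks of length 0 are discarded
    }
    // rule 2: text chunks end with a zero byte, so cut the text there
    const zero = data.indexOf(0);
    const text = zero === -1 ? data : data.subarray(0, zero);
    // rule 5: assuming utf-8 is recommended when IENC is not specified
    return new TextDecoder(encoding || "utf-8").decode(text);
}

// "Rick" followed by the terminal zero byte
const inam = Uint8Array.from([0x52, 0x69, 0x63, 0x6b, 0x00]);
console.log(decodeInfoText(inam)); // prints "Rick"
```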

IENC Chunk Requirements

For Level 3 compatibility, software must support the following encodings (both lowercase and uppercase):

Software may decode other encodings but is not required to.

IPIC Chunk Requirements

For Level 4 compatibility, software must support the following image formats:

Other formats (e.g., gif, webp, ico) may also be supported but are not required.

DBNK Chunk

The DBNK chunk is an optional RIFF chunk within the RMIDI INFO List.

It describes the bank offset for the embedded sound bank.

It always has a length of two bytes, with these bytes forming a 16-bit, unsigned, little-endian number. If the chunk’s length is not two bytes or the number is out of range, the file should be rejected.

The currently allowed range is 0 to 127, inclusive. Because every allowed value fits in the first (low) byte, the second byte is reserved for future use.

If no DBNK is specified, an offset of 1 is assumed by default. If the file does not contain any Sound bank (SF2 or DLS), the offset shall default to 0.

For general use, a bank offset of 0 is recommended as it allows bundling the soundfont and the MIDI without modification.
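
The DBNK rules above could be implemented roughly as follows. This is a minimal sketch, not a normative implementation: the function name `readBankOffset` and its parameters are hypothetical, and rejecting the file is modelled here by throwing an error.

```javascript
// Hypothetical helper.
// `dbnkData` is the data of the DBNK chunk (Uint8Array) or undefined if the chunk is absent.
// `hasSoundBank` tells whether the file embeds an SF2 or DLS sound bank.
function readBankOffset(dbnkData, hasSoundBank) {
    if (!hasSoundBank) {
        return 0; // no embedded bank: the offset is 0 and DBNK is ignored
    }
    if (dbnkData === undefined) {
        return 1; // no DBNK chunk: the offset defaults to 1
    }
    if (dbnkData.length !== 2) {
        throw new Error("Invalid RMIDI file: DBNK must be exactly 2 bytes long");
    }
    // 16-bit unsigned little-endian number: low byte first
    const offset = dbnkData[0] | (dbnkData[1] << 8);
    if (offset > 127) {
        throw new Error(`Invalid RMIDI file: bank offset ${offset} is out of range`);
    }
    return offset;
}

console.log(readBankOffset(Uint8Array.from([0x00, 0x00]), true)); // prints 0
console.log(readBankOffset(undefined, true)); // prints 1
console.log(readBankOffset(Uint8Array.from([0x05, 0x00]), false)); // prints 0
```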

Embedded sound bank

The RMI file may come with an embedded SF2 or DLS sound bank, usually after the INFO chunk. This sound bank provides the exclusive sounds used within the MIDI sequence, temporarily replacing the given MIDI program and bank numbers with the presets contained within the sound bank.

Bank Offset

The bank offset adjusts every bank in the embedded sound bank except for bank 128 by adding itself to every patch’s wBank field.

For files without an embedded sound bank, the bank offset is ignored and assumed to be 0, regardless of the DBNK chunk if present.

For example:

If the resulting bank number exceeds 127 (except for drum kits) or is smaller than 0, then it should be turned into 0.
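
The adjustment described in this section can be sketched as below. The preset objects with `bank` and `program` fields and the function name `applyBankOffset` are assumptions made for illustration; a real SF2 parser will expose the wBank field in its own way.

```javascript
// Hypothetical helper: shifts the bank of every preset in the embedded sound bank.
function applyBankOffset(presets, bankOffset) {
    for (const preset of presets) {
        if (preset.bank === 128) {
            continue; // bank 128 (drum kits) is never adjusted
        }
        const shifted = preset.bank + bankOffset;
        // a result outside of 0-127 is turned into 0
        preset.bank = (shifted > 127 || shifted < 0) ? 0 : shifted;
    }
    return presets;
}

const presets = applyBankOffset([
    { bank: 0, program: 0 },   // melodic preset
    { bank: 127, program: 5 }, // would overflow with an offset of 1
    { bank: 128, program: 0 }  // drum kit
], 1);
console.log(presets.map(p => p.bank)); // prints [ 1, 0, 128 ]
```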

Player pseudo code

Below is simple JavaScript-like code for a Level 1 RMIDI-compatible player.

Note: this code does not perform any checks and assumes that the file is valid and contains all three chunks, for the sake of simplicity.

const file = open("song.rmi");
// read RIFF
const chunk = readRIFF(file);
// skip 'RMID' string
chunk.data.seek(chunk.data.position + 4);
// read 'data' chunk
const midiChunk = readRIFF(chunk.data);
const midiFile = midiChunk.data;

// read the 'LIST' INFO chunk
const info = readRIFF(chunk.data);
// skip the 'INFO' string
info.data.seek(info.data.position + 4);
const infoList = readLIST(info.data);

// bank offset is 1 by default
let bankOffset = 1;
// if DBNK exists
const dbnk = infoList.find(infoChunk => infoChunk.header === "DBNK");
if(dbnk) {
    // DBNK is a 2-byte unsigned little-endian int 16
    bankOffset = dbnk.toUnsignedInt16();
}

// clamp the bank offset
bankOffset = Math.min(Math.max(0, bankOffset), 127);

// read the sound bank (not as a riff chunk but copy the binary content)
const soundFont = chunk.data.slice(chunk.data.position, chunk.data.length);

// initialize the synthesizer
const player = new Player(soundFont);

// adjust bank offset (bank 128 holds the drum kits and is left untouched)
for(const preset of player.soundFont.presets)
{
    if(preset.bankNumber === 128) continue;
    preset.bankNumber += bankOffset;
}

// play the song
player.play(midiFile);

Software Requirements

Not all chunks in the file must be read for the file to play correctly. Software compatibility with the RMIDI format is categorized into levels:

Level 1

Minimum requirements for the software to be compliant. The software must:

This level ensures the correct playback and is recommended for software that does not need to support metadata.

Level 2

This level requires basic interpretation of the INFO chunk. The software must:

Level 3

This level requires support for the IENC chunk. The software must:

As of 2024-08-07, Falcosoft Midi Player meets this level of compatibility.

Level 4

This level requires support for the IPIC chunk. The software must:

As of 2024-08-06, SpessaSynth meets this level of compatibility.

As of 2024-08-20, foo_midi meets this level of compatibility.

Types of RMIDI Files

There are currently two distinct types of RMIDI files that vary in their use cases.

Note that these have identical file structure; these vary only in the way they provide sounds for the sequence.

Self-Contained File

A self-contained file is defined as an SF2 RMIDI file which only refers to its own SoundFont bank, and the said bank contains all and only the necessary presets to play the file. It is recommended to use a DBNK of 0 for writing such files, but it is not required.

Writing self-contained RMIDI files is recommended, but not required.

External File

An external file is defined as an SF2 RMIDI file which relies on a complete sound bank loaded as a fallback, with the embedded sound bank only containing special sound effects specific to the file.

Software not capable of loading two sound banks at once (the main one and the embedded one) may reject the file.

This type of file usually uses bank 1 or greater, but it may use bank 0.

Recommendations for Writing RMIDI Files

The following recommendations are not required for file validity but are advised:

  1. Trim the soundfont to include only presets and samples used in the file to save space.
  2. Write a self-contained file to ensure that it will sound the same in every software.
  3. Always include the DBNK chunk, even if the offset is 1.
  4. Include the IENC chunk to ensure correct encoding is used.
  5. Include the MENC chunk if the encoding is known, to help other software read the MIDI text events correctly.
  6. Use the utf-8 encoding for the metadata chunks if possible.
  7. Use SoundFont3 compression if available to save space.

Example Files

The examples directory contains RMIDI files for testing:

Reference Implementation

Below is the SpessaSynth implementation of the format in JavaScript, which may be useful for developers:

This document is in no way endorsed or otherwise affiliated with the MIDI Manufacturers Association, Microsoft, Creative Technology Ltd. or E-mu Systems, Inc., or any other organization mentioned in this document.

SoundFont® is a registered trademark of Creative Technology Ltd.

All other trademarks are the property of their respective owners.