@threedaymonk
Last active September 27, 2023 12:42
Roland SP-404SX sample format

Roland SP-404SX sample file format

Notes

Field types are marked using C-style notation:

  • char[4] indicates a 4-byte fixed-width string
  • uint[6] indicates a 6-byte sequence
  • uint8, uint16, and uint32 are unsigned byte, short, and long integers
  • int8, int16, and int32 are signed byte, short, and long integers
  • long double is an 80-bit IEEE 754 floating-point number.

Fields that vary are marked in bold. All other fields should be constant across samples.

File name

  • Sample files are stored in SP-404SX/ROLAND/SP-404SX/SMPL/
  • Sample names are in the form A0000001.WAV, i.e. one letter for the bank followed by seven digits, padded with leading zeros, for the pad number.
  • Samples can be either WAVE (.WAV) or AIFF (.AIF).
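As a minimal sketch of the naming rule above, the hypothetical helper below builds a file name from a bank letter and a pad number. It assumes banks run from A to J and pads from 1 to 12 (the range given for the pad info file); `sample_filename` is an illustrative name, not part of any Roland tool.

```python
def sample_filename(bank: str, pad: int, ext: str = "WAV") -> str:
    """Build a sample file name: bank letter + 7-digit zero-padded pad number."""
    # Assumption: banks are A..J and pads are 1..12, as in the pad info file.
    if len(bank) != 1 or bank.upper() not in "ABCDEFGHIJ":
        raise ValueError("bank must be a single letter from A to J")
    if not 1 <= pad <= 12:
        raise ValueError("pad must be between 1 and 12")
    return f"{bank.upper()}{pad:07d}.{ext}"


print(sample_filename("A", 1))          # A0000001.WAV
print(sample_filename("j", 12, "AIF"))  # J0000012.AIF
```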

AIFF sample file format

This format is generated by the sampler when recording samples to the SD card.

Samples are 44.1 kHz 16-bit PCM files in AIFF format with an additional Roland-specific header. They can be either mono or stereo.

The sample is an AIFF file with the structure:

  • AIFF chunk
    • COMM chunk
    • APPL chunk (Roland-specific, with the signature RLND)
    • SSND chunk

This is a big-endian format, i.e. the most significant byte comes first and bytes appear in the order written. For example, 176,400 is 2B110h and appears in the file as 00 02 B1 10.
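The byte order described above can be checked with Python's standard `struct` module, where the `>` prefix selects big-endian packing. This is only an illustration of the worked example in the text.

```python
import struct

value = 176_400  # 2B110h

# ">I" packs an unsigned 32-bit integer with the most significant byte first.
encoded = struct.pack(">I", value)
print(encoded.hex(" "))  # 00 02 b1 10

# Unpacking with the same format recovers the original number.
(decoded,) = struct.unpack(">I", encoded)
print(decoded)  # 176400
```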

AIFF chunk

| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | FORM | |
| ChunkSize | uint32 | 4 | 00 0D BF 78 | see below |
| Format | char[4] | 4 | AIFF | |

Notes:

  • ChunkSize is the length of the file remaining, i.e. the total file length in bytes minus eight.

COMM chunk

| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | COMM | |
| ChunkSize | uint32 | 4 | 00 00 00 12 | 18 bytes |
| NumChannels | uint16 | 2 | 00 02 | 1 or 2 |
| NumSamples | uint32 | 4 | 00 03 6F 60 | see below |
| BitsPerSample | uint16 | 2 | 00 10 | 16-bit |
| SampleRate | long double | 10 | 40 0E AC 44 00 00 00 00 00 00 | 44100 Hz |

Notes:

  • Chunk size is the number of bytes remaining in the chunk. It is always 18.
  • NumChannels is 1 for mono, 2 for stereo.
  • NumSamples is the number of sample frames in the audio data, where one frame holds one sample per channel. For 16-bit stereo audio, this will be 1/4 of the length of the audio data in bytes; for 16-bit mono audio, 1/2.
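The 10-byte SampleRate field is the least familiar value in this chunk. The sketch below decodes it by hand, assuming the usual 80-bit extended layout (1 sign bit, a 15-bit exponent with bias 16383, and a 64-bit mantissa with an explicit integer bit). `decode_extended` is a hypothetical helper, and it deliberately ignores infinities and NaNs.

```python
import math


def decode_extended(data: bytes) -> float:
    """Decode a 10-byte big-endian 80-bit extended-precision float."""
    if len(data) != 10:
        raise ValueError("expected exactly 10 bytes")
    sign_and_exponent = int.from_bytes(data[:2], "big")
    mantissa = int.from_bytes(data[2:], "big")
    sign = -1.0 if sign_and_exponent & 0x8000 else 1.0
    exponent = sign_and_exponent & 0x7FFF
    if exponent == 0 and mantissa == 0:
        return sign * 0.0
    # The mantissa is a 64-bit integer with 63 fractional bits, so the
    # value is mantissa * 2**(exponent - 16383 - 63).
    # Assumption: exponent 0x7FFF (infinity/NaN) never occurs in these files.
    return sign * math.ldexp(mantissa, exponent - 16383 - 63)


sample_rate = decode_extended(bytes.fromhex("400EAC44000000000000"))
print(sample_rate)  # 44100.0
```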

APPL chunk

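| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | APPL | |
| ChunkSize | uint32 | 4 | 00 00 01 C2 | 450 bytes |
| Signature | char[4] | 4 | RLND | |
| Device | char[8] | 8 | roifspsx | |
| Unknown | uint8[3] | 3 | 00 00 00 | |
| SampleIndex | uint8 | 1 | 04 | see below |
| Padding | uint8[] | 434 | 00 00 ... | all zeros |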

Notes:

  • SampleIndex is always 4 for samples recorded on the device.
  • This chunk is sized and padded with zeros to ensure that the sample data starts exactly at offset 512.
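The padding rule can be verified by adding up the pieces that precede the audio data. The sketch below builds an APPL chunk from the field values listed above and checks that the data lands at offset 512. `build_appl_chunk` is a hypothetical helper, and the default index of 4 simply mirrors the value observed for device recordings.

```python
import struct

APPL_BODY_SIZE = 0x1C2  # 450 bytes, the ChunkSize listed above


def build_appl_chunk(sample_index: int = 4) -> bytes:
    """Build the Roland APPL chunk, zero-padded to its fixed size."""
    body = b"RLND" + b"roifspsx" + bytes(3) + bytes([sample_index])
    body += bytes(APPL_BODY_SIZE - len(body))  # 434 bytes of zero padding
    return b"APPL" + struct.pack(">I", len(body)) + body


form_header = 12        # "FORM" + ChunkSize + "AIFF"
comm_chunk = 8 + 18     # chunk header + COMM body
appl_chunk = len(build_appl_chunk())
ssnd_prefix = 8 + 4 + 4  # chunk header + Offset + BlockSize

data_offset = form_header + comm_chunk + appl_chunk + ssnd_prefix
print(data_offset)  # 512
```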

SSND chunk

| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | SSND | |
| ChunkSize | uint32 | 4 | 00 0D BD 88 | see below |
| Offset | uint32 | 4 | 00 00 00 00 | always 0 |
| BlockSize | uint32 | 4 | 00 00 00 00 | always 0 |
| Data | int16[] | ? | | |

Notes:

  • ChunkSize is 8 + NumSamples × NumChannels × BitsPerSample / 8, i.e. the length of the audio data in bytes plus 8 for the Offset and BlockSize fields.
  • Data always starts at offset 512 in the file.
  • Data is a sequence of samples. Each sample is a pair of left and right values for a stereo file.
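As a worked check, the example values in the tables above are consistent with each other: the NumSamples value 00 03 6F 60 for a 16-bit stereo file yields the listed SSND and FORM chunk sizes. The helper name `aiff_sizes` is illustrative only, and the fixed 512-byte header is taken from the notes above.

```python
HEADER_SIZE = 512  # audio data always starts at this offset


def aiff_sizes(num_samples: int, num_channels: int, bits_per_sample: int = 16):
    """Return (SSND ChunkSize, FORM ChunkSize) for a given sample count."""
    data_bytes = num_samples * num_channels * bits_per_sample // 8
    ssnd_chunk_size = 8 + data_bytes                 # Offset + BlockSize + data
    form_chunk_size = HEADER_SIZE + data_bytes - 8   # total file length minus 8
    return ssnd_chunk_size, form_chunk_size


ssnd, form = aiff_sizes(0x00036F60, num_channels=2)
print(f"{ssnd:08X} {form:08X}")  # 000DBD88 000DBF78
```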

WAVE sample file format

This format is generated by the Roland Wave Converter application.

Samples are 44.1 kHz 16-bit PCM files in RIFF WAVE format with an additional Roland-specific header. They can be either mono or stereo.

The sample is a RIFF file with the structure:

  • RIFF chunk
    • fmt chunk
    • RLND chunk
    • data chunk

This is a little-endian format, i.e. the least significant byte comes first and bytes appear in reverse order. For example, 176,400 is 2B110h and appears in the file as 10 B1 02 00.

RIFF chunk

| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | RIFF | |
| ChunkSize | uint32 | 4 | 20 37 02 00 | see below |
| Format | char[4] | 4 | WAVE | |

Notes:

  • ChunkSize is the length of the file remaining, i.e. the total file length in bytes minus eight.

fmt chunk

| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | fmt | |
| ChunkSize | uint32 | 4 | 12 00 00 00 | 18 bytes |
| AudioFormat | uint16 | 2 | 01 00 | PCM |
| NumChannels | uint16 | 2 | 02 00 | 1 or 2 |
| SampleRate | uint32 | 4 | 44 AC 00 00 | 44100 Hz |
| ByteRate | uint32 | 4 | 10 B1 02 00 | see below |
| BlockAlign | uint16 | 2 | 04 00 | see below |
| BitsPerSample | uint16 | 2 | 10 00 | 16-bit |
| ExtraParamSize | uint16 | 2 | 00 00 | |

Notes:

  • The chunk ID is the three letters f, m, t followed by a space, making four bytes in total.
  • Chunk size is the number of bytes remaining in the chunk. It is always 18.
  • NumChannels is 1 for mono, 2 for stereo.
  • ByteRate is computed as SampleRate × NumChannels × BitsPerSample / 8, or more simply as SampleRate × BlockAlign.
  • BlockAlign is computed as NumChannels × BitsPerSample / 8, i.e. 2 for a mono sample and 4 for stereo.
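The two derived fields can be computed and packed in a few lines. The sketch below builds the whole 26-byte fmt chunk with `struct`, using `<` for little-endian; `build_fmt_chunk` is a hypothetical helper name. For a stereo sample the ByteRate bytes come out as 10 B1 02 00, matching the 176,400 example given for this format.

```python
import struct


def build_fmt_chunk(num_channels: int, sample_rate: int = 44100,
                    bits_per_sample: int = 16) -> bytes:
    """Build the 18-byte fmt chunk body plus its 8-byte chunk header."""
    block_align = num_channels * bits_per_sample // 8
    byte_rate = sample_rate * block_align
    body = struct.pack(
        "<HHIIHHH",
        1,                # AudioFormat: PCM
        num_channels,
        sample_rate,
        byte_rate,
        block_align,
        bits_per_sample,
        0,                # ExtraParamSize
    )
    return b"fmt " + struct.pack("<I", len(body)) + body


chunk = build_fmt_chunk(num_channels=2)
print(len(chunk))             # 26
print(chunk[16:20].hex(" "))  # 10 b1 02 00
```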

RLND chunk

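| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | RLND | |
| ChunkSize | uint32 | 4 | CA 01 00 00 | 458 bytes |
| Device | char[8] | 8 | roifspsx | |
| Unknown | uint8[3] | 3 | 00 00 00 | |
| SampleIndex | uint8 | 1 | 05 | see below |
| Padding | uint8[] | 446 | 00 00 ... | all zeros |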

Notes:

  • SampleIndex starts at zero for A1 and increases by 12 for each bank, i.e. A1 = 00h, A5 = 04h, B5 = 10h.
  • This chunk is sized and padded with zeros to ensure that the sample data starts exactly at offset 512.
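The SampleIndex rule amounts to numbering the pads from zero, 12 per bank. A minimal sketch, with hypothetical helper names and assuming banks A to J:

```python
def sample_index(bank: str, pad: int) -> int:
    """Map a bank letter and 1-based pad number to a SampleIndex."""
    return (ord(bank.upper()) - ord("A")) * 12 + (pad - 1)


def bank_and_pad(index: int) -> tuple:
    """Inverse mapping: SampleIndex back to (bank letter, pad number)."""
    bank, pad_zero_based = divmod(index, 12)
    return chr(ord("A") + bank), pad_zero_based + 1


print(hex(sample_index("A", 1)))  # 0x0
print(hex(sample_index("A", 5)))  # 0x4
print(hex(sample_index("B", 5)))  # 0x10
print(bank_and_pad(0x10))         # ('B', 5)
```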

data chunk

| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| ChunkID | char[4] | 4 | data | |
| ChunkSize | uint32 | 4 | | see below |
| Data | int16[] | ? | | |

Notes:

  • ChunkSize is equivalent to the number of samples × BlockAlign
  • Data always starts at offset 512 in the file.
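Putting the four chunks together, the sketch below assembles a complete WAVE sample from raw 16-bit little-endian PCM data. It is an illustration built only from the field values listed in this section, not a tested replacement for the Roland Wave Converter; `build_wave_file` is a hypothetical name, and the RLND chunk is padded to the 458-byte size given by its ChunkSize field.

```python
import struct

RLND_BODY_SIZE = 0x1CA  # 458 bytes, from the RLND ChunkSize field


def build_wave_file(pcm: bytes, num_channels: int, sample_index: int) -> bytes:
    """Wrap raw 16-bit PCM in RIFF, fmt, RLND, and data chunks."""
    sample_rate, bits = 44100, 16
    block_align = num_channels * bits // 8
    if len(pcm) % block_align:
        raise ValueError("PCM length must be a multiple of BlockAlign")

    fmt_body = struct.pack("<HHIIHHH", 1, num_channels, sample_rate,
                           sample_rate * block_align, block_align, bits, 0)
    rlnd_body = b"roifspsx" + bytes(3) + bytes([sample_index])
    rlnd_body += bytes(RLND_BODY_SIZE - len(rlnd_body))  # zero padding

    chunks = (b"fmt " + struct.pack("<I", len(fmt_body)) + fmt_body
              + b"RLND" + struct.pack("<I", len(rlnd_body)) + rlnd_body
              + b"data" + struct.pack("<I", len(pcm)))

    # RIFF ChunkSize covers everything after the first eight bytes.
    riff_size = 4 + len(chunks) + len(pcm)
    header = b"RIFF" + struct.pack("<I", riff_size) + b"WAVE" + chunks
    if len(header) != 512:
        raise AssertionError("audio data must start at offset 512")
    return header + pcm


silence = bytes(144_680)  # stereo silence, sized to match the example file
wave = build_wave_file(silence, num_channels=2, sample_index=5)
print(wave[4:8].hex(" "))  # 20 37 02 00
```

The printed RIFF ChunkSize matches the example value in the RIFF chunk table, which suggests the example file held 144,680 bytes of audio data.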

Pad info file

This is stored alongside the samples in PADINFO.BIN and contains 120 × 32-byte records, one for each pad from A1 to J12.

In this file, values are stored in big-endian order, i.e. an offset of 512 is 200h and stored as 00 00 02 00.

| Field | Type | Bytes | Data | Notes |
|---|---|---|---|---|
| OrigSampleStart | uint32 | 4 | 00 00 02 00 | |
| OrigSampleEnd | uint32 | 4 | 00 02 37 28 | |
| UserSampleStart | uint32 | 4 | 00 00 02 00 | |
| UserSampleEnd | uint32 | 4 | 00 02 37 28 | |
| Volume | uint8 | 1 | 7F | 0-127 |
| Lofi | uint8 | 1 | 00 | 0/1 |
| Loop | uint8 | 1 | 00 | 0/1 |
| Gate | uint8 | 1 | 01 | 0/1 |
| Reverse | uint8 | 1 | 00 | 0/1 |
| Format | uint8 | 1 | 01 | see below |
| Channels | uint8 | 1 | 02 | 1 or 2 |
| TempoMode | uint8 | 1 | 00 | see below |
| OrigTempo | uint32 | 4 | 00 00 04 B0 | see below |
| UserTempo | uint32 | 4 | 00 00 04 B0 | see below |

Notes:

  • TempoMode is 0 = off, 1 = pattern, 2 = user.
  • Format is 0 for an AIFF sample, and 1 for a WAVE sample. It is possible that this corresponds solely to the endianness of the data (0 = big endian, 1 = little endian).
  • OrigTempo and UserTempo appear to be beats per minute multiplied by 10, i.e. 4B0h = 1200 = 120 bpm. The Roland Wave Converter application computes the original tempo as 120 / sample length.
  • Sample start and end offsets are relative to the original file: Wave Converter sets this to 512 because the length of the RIFF headers before the raw wave data is exactly 512 bytes.
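A record layout like this maps directly onto a `struct` format string. The sketch below parses a whole PADINFO.BIN image into named tuples; the field names follow the table above, while `PadInfo` and `parse_padinfo` are hypothetical names. The division by 10 for tempo follows the observation in the notes and is an assumption rather than a documented fact.

```python
import struct
from typing import NamedTuple

# Four uint32, eight uint8, two uint32, all big-endian: 32 bytes per record.
RECORD = struct.Struct(">IIIIBBBBBBBBII")


class PadInfo(NamedTuple):
    orig_sample_start: int
    orig_sample_end: int
    user_sample_start: int
    user_sample_end: int
    volume: int
    lofi: int
    loop: int
    gate: int
    reverse: int
    format: int
    channels: int
    tempo_mode: int
    orig_tempo: int
    user_tempo: int


def parse_padinfo(data: bytes) -> list:
    """Parse the 120 pad records (A1 to J12) from PADINFO.BIN contents."""
    if len(data) != 120 * RECORD.size:
        raise ValueError("expected 120 records of 32 bytes")
    return [PadInfo(*fields) for fields in RECORD.iter_unpack(data)]


# Example record assembled from the Data column of the table above.
example = bytes.fromhex(
    "00000200" "00023728" "00000200" "00023728"
    "7f" "00" "00" "01" "00" "01" "02" "00"
    "000004b0" "000004b0"
)
pads = parse_padinfo(example * 120)
print(pads[0].volume, pads[0].orig_tempo / 10)  # 127 120.0
```

To read a real file, the same function could be called on the result of `open("PADINFO.BIN", "rb").read()`.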

@threedaymonk
Author

What is the definition of the sample length? Do you have any reference for the statement above, or is this something you figured out yourself?

I figured it out myself through observation. Sample length would be the number of samples, i.e. length in bytes divided by 4 for stereo or length in bytes divided by 2 for mono.

@MatthewCallis

I found that the RLND Unknown field was 4 bytes (or one large int) when working with my own PAD_INFO.BIN file. I'm curious whether they changed the header in a software revision, as I couldn't get anything but the latest version to run. I took what you wrote and added it to my SP-404SX sample manager and libs. Thank you so much for taking the time to document the formats!

https://github.com/MatthewCallis/super-pads
https://github.com/uttori/uttori-audio-padinfo
https://github.com/uttori/uttori-audio-wave
