Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
Yamaha YSFC file format
Brief notes on the Yamaha-YSFC file format (version 1.0.2)
==========================================================
All Motif XF native files (X3A, X3G, X3V, etc) have an identical structure. The
extension just identifies which data types are contained within that structure
for the purposes of user interface.
32-bit and 16-bit quantities mentioned below are unsigned and stored in
big-endian byte ordering unless otherwise described.
Header
------
The file header looks like:
00000000 59 41 4d 41 48 41 2d 59 53 46 43 00 00 00 00 00 |YAMAHA-YSFC.....|
00000010 31 2e 30 2e 32 00 00 00 00 00 00 00 00 00 00 00 |1.0.2...........|
00000020 nn nn nn nn ff ff ff ff ff ff ff ff ff ff ff ff |................|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
Earlier versions of the Motif seem to produce versions 1.0.1 and 1.0.0 with
an identical header, and one slight format difference. (See below under 'Entry
lists'.)
nn nn nn nn is the (big-endian) size in bytes of the catalogue which follows.
Catalogue
---------
Top-level blocks of data within these files are identified by four letter chunk
IDs, for instance EVCE or DPFM. The catalogue consists of 8 byte entries: each
being the four byte identifier followed by the byte offset of that block within
the file.
Here is an example catalogue from the Yamaha InspirationInAFlash.X3A file:
00000040 45 4d 4c 54 00 00 01 30 45 4d 53 54 00 00 08 83 |EMLT...0EMST....|
00000050 45 50 46 4d 00 00 29 89 45 56 43 45 00 00 bc e1 |EPFM..).EVCE....|
00000060 45 57 46 4d 00 01 94 b4 45 41 52 50 00 01 ed 66 |EWFM....EARP...f|
00000070 45 50 54 4e 00 02 11 a9 45 53 4e 47 00 02 12 61 |EPTN....ESNG...a|
00000080 45 53 59 53 00 02 14 2c 45 46 56 54 00 02 14 68 |ESYS...,EFVT...h|
00000090 45 53 43 48 00 02 15 10 45 50 43 48 00 02 15 52 |ESCH....EPCH...R|
000000a0 45 50 4d 54 00 02 16 0a 45 53 4d 54 00 02 16 c2 |EPMT....ESMT....|
000000b0 45 57 49 4d 00 02 18 8d 44 4d 4c 54 00 02 71 3f |EWIM....DMLT..q?|
000000c0 44 4d 53 54 00 03 aa cb 44 50 46 4d 00 04 c2 d7 |DMST....DPFM....|
000000d0 44 56 43 45 00 0b fa e3 44 57 46 4d 00 1f 2b 7f |DVCE....DWFM..+.|
000000e0 44 41 52 50 00 23 58 47 44 50 54 4e 00 24 70 62 |DARP.#XGDPTN.$pb|
000000f0 44 53 4e 47 00 25 cc c2 44 53 59 53 00 29 70 ec |DSNG.%..DSYS.)p.|
00000100 44 46 56 54 00 29 75 38 44 53 43 48 00 29 a8 70 |DFVT.)u8DSCH.).p|
00000110 44 50 43 48 00 29 a8 e8 44 50 4d 54 00 29 ab d3 |DPCH.)..DPMT.)..|
00000120 44 53 4d 54 00 29 c9 43 44 57 49 4d 00 2a 17 af |DSMT.).CDWIM.*..|
Blocks
------
Directly after the catalogue, the data blocks follow one by one until the end
of the file. Each data block has the following overall format:
4 bytes identifier, e.g. "EMLT"
4 bytes (unsigned big-endian) length of payload (n)
n bytes payload
There are two types of blocks, which are always included in matching pairs of
each type. A list of entries has a type beginning with 'E', e.g. "EVCE", and
its corresponding data has a type beginning with 'D', e.g. "DVCE". The 'E'
block should always preceed the 'D' block in the file. Otherwise, block order
doesn't seem to matter.
Entry lists
-----------
An entry list block is effectively an index pointing into the corresponding
data block. After the initial Exxx identifier and block length, they begin with
a four byte count of entries, followed by the entries one by one:
4 bytes magic "Entr"
4 bytes length of this entry, excluding these first 8 bytes
4 bytes unknown (apparently always zero)
4 bytes size of associated data in the corresponding Dxxx block (see below)
4 bytes unknown (apparently always zero in Motif output; usually zero in
third-party files but not always)
4 bytes offset of associated data in the corresponding Dxxx block
4 bytes program number (e.g. voice program, arpeggio number, song number)
2 bytes unknown (only 1 byte in versions 1.0.1 and below)
NUL terminated entry name, e.g. "0:256:Natural Grand S6\x00"
NUL terminated filename, e.g. "000-Voice.vce\x00"
For voice entries, there then follow the NUL terminated filenames associated
with any user waveforms which the voice depends upon. This is presumably a
special case for voices and waveforms: there does not appear to be equivalent
dependency information for arpeggios which voices depend upon, nor voices
which performances, songs and patterns depend upon and so on.
All the bytes described as unknown can apparently be set to zero safely
without causing any problems, so we ignore these on both input and output.
Data blocks
-----------
Following the initial Dxxx identifier and block length, the data blocks also
have a four byte count of entries, followed by a sequence of Data chunks
corresponding to the Entr chunks in the Exxx block.
Each Data chunk has 4 bytes magic "Data", followed by 4 bytes length
(excluding these first 8 bytes, as with Entr chunks) and then the data itself.
The offset stored in the Entr chunk points eight bytes into the Data chunk,
i.e. to the beginning of the data itself. The size at the start of the Data
chunk should match that stored in the Entr chunk and is the size of the raw
data excluding the initial eight bytes of magic-plus-length.
Block types
-----------
[NNN below are zero-padded and numbered from zero, i.e. 000, 001, ...]
ESYS, DSYS - system configuration
(exactly one Entr, number 0, name System, filename system.sys)
ESCH, DSCH - song chain
(exactly one Entr, number 0, name SongChain, filename songchain.sch)
EARP, DARP - user arpeggios (filenames NNN-Arpeggio.arp)
EMLT, DMLT - mix templates (filenames NNN-Template.mlt)
EMST, DMST - master mode programs (filenames NNN-Master.mst)
EPFM, DPFM - performances (filenames NNN-Performance.pfm)
EWFM, DWFM - waveforms (filenames NNNN-Waveform.wfm, small => metadata?)
EWIM, DWIM - waveforms (filenames NNNN-Waveform.wim, big => wave data?)
ESNG, DSNG - songs (filenames NNN-Song.sng)
ESMT, DSMT - song mixings (filenames NNN-Song.smt)
EPTN, DPTN - patterns (filenames NNN-Pattern.ptn)
EPMT, DPMT - pattern mixings (filenames NNN-Pattern.pmt)
EPCH, DPCH - pattern chains (filenames NNN-Pattern.pch)
EFVT, DFVT - favorites:
0x00 "Favorite" favourite.fvt
0x01 "Favorite" arpeggio.fvt
0x02 "Favorite" waveform.fvt
EVCE, DVCE - voices (filenames XXXXXX-Voice.vce, e.g. 3F0800-Voice.vce)
0x000000 - 0x00007f - GM
0x3f0000 - 0x3f007f - PRE1
0x3f0100 - 0x3f017f - PRE2
0x3f0200 - 0x3f027f - PRE3
0x3f0300 - 0x3f037f - PRE4
0x3f0400 - 0x3f047f - PRE5
0x3f0500 - 0x3f057f - PRE6
0x3f0600 - 0x3f067f - PRE7
0x3f0700 - 0x3f077f - PRE8
0x3f0800 - 0x3f087f - USR1
0x3f0900 - 0x3f097f - USR2
0x3f0a00 - 0x3f0a7f - USR3
0x3f0b00 - 0x3f0b7f - USR4
0x3f2000 - 0x3f207f - PREDR
0x3f2800 - 0x3f287f - USRDR
0x3f8000 - 0x3fbfff - song sample (00-7f) and mixing (80-ff) voices
0x3fc000 - 0x3fffff - pattern sample (00-7f) and mixing (80-ff) voices
0x7f0000 - 0x7f0000 - GMDR
GM, PRE1-8, PREDR and GMDR are preset and are not included in [ED]VCE data.
Categories
----------
Category information is included at the beginning of the names of waveforms,
voices, arpeggios and performances. For instance, Natural Grand S6 is actually
named "0:256:Natural Grand S6" and Piano Electro is "36:Piano Electro".
Waveforms, arpeggios and peformances have just one associated category, and
thus are named with a single non-negative integer prefix of the form "%d:".
However, voices can belong to a secondary category so have have two decimal
integers "%d:%d:" as a prefix. The secondary category for a voice is set to
256 if it isn't required.
Extracting arp information
--------------------------
Arp numbers are stored in the data block of each voice (DVCE), performance
(DPFM) and multi (pattern/song/mixing template: DPMT/DSMT/DMLT). They are
stored as little-endian 16-bit integers. 0x0000 is arp-off, 0x0001 is preset
arp 1, 0x0002 is preset arp 2, 0x1ec9 (i.e. bytes "\xc9\x1e") is preset arp
7881, 0x2000 is user arp 1, 0x2001 is user arp 2, etc.
Measuring offsets with zero pointing to the first real data block (i.e.
ignoring the "Data" magic and size word), in normal voice DVCE blocks, arp
number n = 0, ..., 4 is stored in the two bytes at offset 0x658 + 2n. However,
in drum voice DVCE blocks, arp n = 0, ..., 4 is stored at offset 0x2be8 + 2n.
In DPFM blocks, arp n = 0, ..., 4 of part p = 0, ..., 3 is stored at offset
0x2b8 + (0x38)p + 2n.
In DMLT, DPMT and DSMT blocks, arp n = 0, ..., 4 of part p = 0, ..., 15 is
stored at offset 0x648 + (0x38)p + 2n.
These locations can be read to include user arp dependencies with voices,
performances and mixings automatically, but also updated when renumbering user
arps without gaps in the resulting output file.
Extracting voice information
----------------------------
Voice numbers are stored in the data block of each performance (DPFM) and
multi (DPMT/DSMT/DMLT). They are stored as three bytes in big-endian order, as
described in the bank select section of the Motif XF Data List:
0x000000 - 0x00007f - GM
0x3f0000 - 0x3f007f - PRE1
0x3f0100 - 0x3f017f - PRE2
0x3f0200 - 0x3f027f - PRE3
0x3f0300 - 0x3f037f - PRE4
0x3f0400 - 0x3f047f - PRE5
0x3f0500 - 0x3f057f - PRE6
0x3f0600 - 0x3f067f - PRE7
0x3f0700 - 0x3f077f - PRE8
0x3f0800 - 0x3f087f - USR1
0x3f0900 - 0x3f097f - USR2
0x3f0a00 - 0x3f0a7f - USR3
0x3f0b00 - 0x3f0b7f - USR4
0x3f2000 - 0x3f207f - PREDR
0x3f2800 - 0x3f287f - USRDR
0x3f3200 - 0x3f32ff - sample voice
0x3f3c00 - 0x3f3c0f - mixing voice
0x7f0000 - 0x7f0000 - GMDR
(Note the difference in numbering to the numbering of voices in EVCE for
sample and mixing voices, as these numbers are local to a particular
song/pattern.)
For both parts p = 0, ..., 3 of a performance, and parts p = 0, ..., 15 of a
multi, the 3-byte voice number is stored at offset 0x160 + (0x4c)p, where
offset zero points to the first byte after the "Data" magic and size word.
These locations can be read to include user voice dependencies with
performances and mixings automatically, but also updated when renumbering user
voices without gaps in the resulting output file.
Waveform numbering
------------------
Program number zero is unused.
Waveforms in the USR bank are stored with program numbers 1 to 128 (filenames
0001-Waveform.wfm to 0128-Waveform.wfm) and are referred to in voice elements
as wave numbers 1 to 128 with wave select 1 = USR.
Waveforms in the FL1 bank are stored with program numbers 129 to 2176
(filenames 0129-Waveform.wfm to 2176-Waveform.wfm) and are referred to in
voice elements as wave numbers 1 to 2048 with wave select 2 = FL1.
Waveforms in the FL2 bank are stored with program numbers 2177 to 4224
(filenames 2177-Waveform.wfm to 4224-Waveform.wfm) and are referred to in
voice elements as wave numbers 1 to 2048 with wave select 3 = FL2.
#!/bin/python
import struct
import sys
def input(file = sys.stdin, types = None):
data = file.read(64)
if len(data) != 64 or data[0:16].rstrip('\x00') != "YAMAHA-YSFC" \
or data[36:64] != 28 * '\xff':
raise ValueError, "Invalid file format"
try:
version = tuple(map(int, data[16:32].rstrip('\x00').split('.')))
size = struct.unpack(">I", data[32:36])[0]
catalog = file.read(size)
assert len(version) == 3 and len(catalog) == size
except:
raise ValueError, "Truncated file"
if version[0:2] != (1, 0) or version[2] not in [0, 1, 2]:
raise ValueError, "Unsupported file format version"
blocks = {}
while True:
name = file.read(4)
if not name:
return version, blocks
elif len(name) != 4:
raise ValueError, "Truncated file"
elif not name.isalpha() or not name.isupper():
raise ValueError, "Invalid file format"
try:
size = struct.unpack(">I", file.read(4))[0]
if not types or name in types:
blocks[name] = file.read(size)
assert len(blocks[name]) == size
else:
file.seek(size, 1)
except:
raise ValueError, "Truncated file"
def unpack(version, entries, data):
count, cursor, items = struct.unpack_from(">I", entries)[0], 4, []
while (cursor < len(entries)):
magic, length = struct.unpack_from(">4sI", entries, cursor)
if magic != "Entr" or cursor + length + 8 > len(entries):
raise ValueError, "Invalid file format"
size, offset, number = struct.unpack_from(">4xI4x2I", entries, cursor + 8)
if version <= (1, 0, 1):
names = entries[cursor + 29:cursor + length + 8]
else:
names = entries[cursor + 30:cursor + length + 8]
names = names.rstrip('\x00').split('\x00')
item = { 'number': number, 'name': names[0], 'filename': names[1] }
if len(names) > 2:
item['depends'] = names[2:]
if data:
item['data'] = data[offset - 8:offset + size]
items.append(item)
cursor += length + 8
if len(items) != count:
raise ValueError, "Invalid file format"
else:
return items
def pack(version, items):
count, entries, data = 0, "", ""
for item in items:
size = len(item['data']) - 8
if version <= (1, 0, 1):
entry = struct.pack(">4xI4x2I1x", size, len(data) + 12, count)
else:
entry = struct.pack(">4xI4x2I2x", size, len(data) + 12, count)
entry += "%s\x00%s\x00" % (item['name'], item['filename'])
if item.get('depends'):
entry += '\x00'.join(item['depends']) + '\x00'
entries += struct.pack(">4sI", "Entr", len(entry)) + entry
data += item['data']
count += 1
count = struct.pack(">I", count)
return count + entries, count + data
def output(version, blocks, file = sys.stdout):
file.write(struct.pack("16s", "YAMAHA-YSFC"))
file.write(struct.pack("16s", '.'.join(map(str, version))))
file.write(struct.pack(">I", 8*len(blocks)) + 28*'\xff')
names = blocks.keys()
names.sort(key = lambda n: '0' + n if n[0] == 'E' else '1' + n)
offset = 64 + 8*len(names)
for name in names:
file.write(struct.pack(">4sI", name, offset))
offset += 8 + len(blocks[name])
for name in names:
file.write(struct.pack(">4sI", name, len(blocks[name])))
file.write(blocks[name])
if len(sys.argv) < 2:
sys.stderr.write("""\
Usage: %s IN1[:NUM,NUM,..] [IN2[:NUM,NUM,..]].. >OUT
Combines the arps from Yamaha YSFC files IN1, IN2, ... into a single X0G/X3G
file OUT, selecting just the specified arp numbers if the option :NUM,NUM,..
suffices are supplied. Arps are numbered from 001 as in the Motif user
interface.
""" % sys.argv[0])
sys.exit(1)
out = []
for arg in sys.argv[1:]:
if ':' in arg:
arg, _, numbers = arg.rpartition(':')
numbers = [int(number) - 1 for number in numbers.split(',')]
else:
numbers = xrange(0xffffffff)
with open(arg, 'rb') as file:
version, blocks = input(file = file, types = ['EARP', 'DARP'])
arps = unpack(version, blocks['EARP'], blocks['DARP'])
out += [arp for arp in arps if arp['number'] in numbers]
for number, arp in enumerate(out):
arp['number'] = number
arp['filename'] = "%03d-Arpeggio.arp" % number
if len(out) > 256:
sys.stderr.write("Warning: Motif will refuse to read more than 256 arps\n")
if out:
earp, darp = pack(version, out)
output(version, { "EARP": earp, "DARP": darp })
#!/bin/python
import struct
import sys
def input(file = sys.stdin, types = None):
data = file.read(64)
if len(data) != 64 or data[0:16].rstrip('\x00') != "YAMAHA-YSFC" \
or data[36:64] != 28 * '\xff':
raise ValueError, "Invalid file format"
try:
version = tuple(map(int, data[16:32].rstrip('\x00').split('.')))
size = struct.unpack(">I", data[32:36])[0]
catalog = file.read(size)
assert len(version) == 3 and len(catalog) == size
except:
raise ValueError, "Truncated file"
if version[0:2] != (1, 0) or version[2] not in [0, 1, 2]:
raise ValueError, "Unsupported file format version"
blocks = {}
while True:
name = file.read(4)
if not name:
return version, blocks
elif len(name) != 4:
raise ValueError, "Truncated file"
elif not name.isalpha() or not name.isupper():
raise ValueError, "Invalid file format"
try:
size = struct.unpack(">I", file.read(4))[0]
if not types or name in types:
blocks[name] = file.read(size)
assert len(blocks[name]) == size
else:
file.seek(size, 1)
except:
raise ValueError, "Truncated file"
def unpack(version, entries, data):
count, cursor, items = struct.unpack_from(">I", entries)[0], 4, []
while (cursor < len(entries)):
magic, length = struct.unpack_from(">4sI", entries, cursor)
if magic != "Entr" or cursor + length + 8 > len(entries):
raise ValueError, "Invalid file format"
size, offset, number = struct.unpack_from(">4xI4x2I", entries, cursor + 8)
if version <= (1, 0, 1):
names = entries[cursor + 29:cursor + length + 8]
else:
names = entries[cursor + 30:cursor + length + 8]
names = names.rstrip('\x00').split('\x00')
item = { 'number': number, 'name': names[0], 'filename': names[1] }
if len(names) > 2:
item['depends'] = names[2:]
if data:
item['data'] = data[offset - 8:offset + size]
items.append(item)
cursor += length + 8
if len(items) != count:
raise ValueError, "Invalid file format"
else:
return items
def bankname(number):
bank, program = number >> 8, 1 + (number & 0xff)
if bank in range(0x3f08, 0x3f10):
return "USR%d:%03d" % (bank - 0x3f07, program)
if bank == 0x3f28:
return "USRDR:%03d" % program
if bank in range(0x3f80, 0x3fc0):
if program <= 128:
return "SNG%d:SP%03d" % (bank - 0x3f7f, program)
else:
return "SNG%d:MV%03d" % (bank - 0x3f7f, program - 128)
if bank in range(0x3fc0, 0x4000):
if program <= 128:
return "PTN%d:SP%03d" % (bank - 0x3fbf, program)
else:
return "PTN%d:MV%03d" % (bank - 0x3fbf, program - 128)
return "0x%06x" % number
version, blocks = input()
for block in blocks.keys():
if block[0] != 'E':
continue
for item in unpack(version, blocks[block], None):
if block == 'EVCE':
print "VCE %s %s" % (bankname(item['number']), item['name'])
else:
print "%3s %04d %s" % (block[1:], item['number'] + 1, item['name'])
#!/bin/python
import struct
import sys
def input(file = sys.stdin, types = None):
data = file.read(64)
if len(data) != 64 or data[0:16].rstrip('\x00') != "YAMAHA-YSFC" \
or data[36:64] != 28 * '\xff':
raise ValueError, "Invalid file format"
try:
version = tuple(map(int, data[16:32].rstrip('\x00').split('.')))
size = struct.unpack(">I", data[32:36])[0]
catalog = file.read(size)
assert len(version) == 3 and len(catalog) == size
except:
raise ValueError, "Truncated file"
blocks = {}
while True:
name = file.read(4)
if not name:
return version, blocks
elif len(name) != 4:
raise ValueError, "Truncated file"
elif not name.isalpha() or not name.isupper():
raise ValueError, "Invalid file format"
try:
size = struct.unpack(">I", file.read(4))[0]
if not types or name in types:
blocks[name] = file.read(size)
assert len(blocks[name]) == size
else:
file.seek(size, 1)
except:
raise ValueError, "Truncated file"
def output(version, blocks, file = sys.stdout):
file.write(struct.pack("16s", "YAMAHA-YSFC"))
file.write(struct.pack("16s", '.'.join(map(str, version))))
file.write(struct.pack(">I", 8*len(blocks)) + 28*'\xff')
names = blocks.keys()
names.sort(key = lambda n: '0' + n if n[0] == 'E' else '1' + n)
offset = 64 + 8*len(names)
for name in names:
file.write(struct.pack(">4sI", name, offset))
offset += 8 + len(blocks[name])
for name in names:
file.write(struct.pack(">4sI", name, len(blocks[name])))
file.write(blocks[name])
version, blocks = input()
del blocks['ESYS'], blocks['DSYS']
output(version, blocks)
Owner

arachsys commented Jun 13, 2012

See https://gist.github.com/2919521 for the mapping between the Motif XF (version 1.0.2) and Motif XS (version 1.0.1) preset waveform and arpeggio numbers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment