@thinkhy
Created July 7, 2011 08:55
Extract MathML from a GIF image generated by MathType (Updated)
# Reference: http://stackoverflow.com/questions/6599781/how-can-i-convert-mathtype-equation-into-mathml-format
# Thanks afwings
################################################################################################
# GIF Image Files
#
# MathML text is embedded into a GIF file as an Application Extension Record,
# which consists of a 14-byte header (Application Extension Descriptor),
# followed by the MathML data. The header contains:
#
# Byte Introducer = 0x21;
# Byte ExtensionLabel = 0xFF;
# Byte BlockSize = 0x0B;
# Byte ApplicationId[8] = "MathType";
# Byte AuthenticationCode[3] = "003";
#
# The data follows this header and is written as a series of blocks, each containing at most 255 bytes.
# Each block starts with a single-byte count followed by that many bytes of data.
# The end is marked by a block of length 0.
#
# The header is unique enough that the easiest way to extract the data might be to
# scan the file for the 14-byte header, then expect the MathML data blocks to follow.
# Properly decoding the GIF records isn't that hard either, but obviously requires
# that you read the GIF specification.
#
################################################################################################
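use strict;
use warnings;

# ReadFromFile is not defined in the original snippet; this is an assumed
# minimal version that returns the file contents as raw bytes.
sub ReadFromFile
{
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!\n";
    local $/;    # slurp mode: read the whole file in one go
    return scalar <$fh>;
}

# Assumption: the GIF path is passed as the first command-line argument.
my $inputFile = shift @ARGV or die "Usage: $0 image.gif\n";
my $math = ReadFromFile($inputFile);
my $mathTypeId = "MathType003";
# Find the extension header; '.' with /s replaces \C, which newer Perl versions no longer support.
$math =~ m/\x21\xFF(.)\Q$mathTypeId\E(.*)/s or die "MathType extension header not found.\n";
my $blockSize = ord($1);
my $remain = $2;
length($mathTypeId) == $blockSize || print "Block size unmatched!\n";
my $result = "";
# Each sub-block is one length byte followed by that many data bytes; length 0 ends the data.
while (length($remain) && ($blockSize = ord(substr($remain, 0, 1))) != 0)
{
    if (length($remain) < $blockSize + 1)
    {
        print "\nBlock size is NOT correct.\n";
        last;
    }
    $result .= substr($remain, 1, $blockSize);
    $remain = substr($remain, $blockSize + 1);
}
print "RESULT: $result\n";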
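
The block layout described in the header comment can be checked without a real MathType image. The following is a minimal sketch, not part of the original script: `encode_sub_blocks`, `decode_sub_blocks`, and the sample payload are hypothetical names and data introduced for illustration. It builds the 14-byte header with `pack`, splits a payload into length-prefixed blocks of at most 255 bytes, and reads them back.

```perl
use strict;
use warnings;

# Split a payload into GIF data sub-blocks: each is a length byte (1..255)
# followed by that many bytes; a zero length byte terminates the sequence.
sub encode_sub_blocks {
    my ($payload) = @_;
    my $out = '';
    for (my $pos = 0; $pos < length($payload); $pos += 255) {
        my $chunk = substr($payload, $pos, 255);
        $out .= chr(length($chunk)) . $chunk;
    }
    return $out . "\x00";
}

# Reassemble the payload from sub-blocks that start at the beginning of $data.
# Returns undef if the data ends before the zero-length terminator is reached.
sub decode_sub_blocks {
    my ($data) = @_;
    my ($pos, $payload) = (0, '');
    while ($pos < length($data)) {
        my $size = ord(substr($data, $pos, 1));
        return $payload if $size == 0;
        return undef if $pos + 1 + $size > length($data);
        $payload .= substr($data, $pos + 1, $size);
        $pos += 1 + $size;
    }
    return undef;
}

# Hypothetical sample payload, longer than 255 bytes so that it spans two blocks.
my $sample = '<math>' . ('<mi>x</mi>' x 30) . '</math>';

# 14-byte header: introducer, extension label, block size, application id, auth code.
my $header = pack('C3 A8 A3', 0x21, 0xFF, 0x0B, 'MathType', '003');
my $record = $header . encode_sub_blocks($sample);

my $decoded = decode_sub_blocks(substr($record, length($header)));
my $ok = defined($decoded) && $decoded eq $sample;
print length($header), " header bytes, ", length($sample), " payload bytes\n";
print $ok ? "round trip ok\n" : "round trip failed\n";
```

Working with `substr` offsets rather than regular expressions keeps the decoder independent of how the Perl version in use treats byte-level regex escapes, and it makes truncated input easy to detect.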

ghost commented Jun 14, 2017

Hi! I hope you receive this message - I am new to Perl, but desperately need this little script you have written. I am running Strawberry Perl on my Windows PC. When I attempt to run your script, however, I receive the following error: "Undefined subroutine &main::ReadFromFile called at extractMathMLfromGIF.pl line 28" Sorry for such a noob question, but what am I missing here? Please contact me at dans@thinkwell.com. Thanks! - Dan
