Created
July 7, 2011 08:55
Extract MathML from GIF image generated by MathType (Updated)
# Reference: http://stackoverflow.com/questions/6599781/how-can-i-convert-mathtype-equation-into-mathml-format
# Thanks afwings
################################################################################################
# GIF Image Files
#
# MathML text is embedded into a GIF file as an Application Extension Record,
# which consists of a 14-byte header (Application Extension Descriptor),
# followed by the MathML data. The header contains:
#
# Byte Introducer = 0x21;
# Byte ExtensionLabel = 0xFF;
# Byte BlockSize = 0x0B;
# Byte ApplicationId[8] = "MathType";
# Byte AuthenticationCode[3] = "003";
#
# The data follows this header and is written as a series of blocks, each containing 255 bytes or fewer.
# Each block starts with a single-byte count followed by the data.
# The end is marked by a block with length 0.
#
# The header is unique enough that the easiest way to extract the data might be to
# scan the file for the 14-byte header, then expect the MathML data blocks to follow.
# Properly decoding the GIF records isn't that hard either, but it requires
# that you read the GIF specification.
#
################################################################################################
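use strict;
use warnings;

my $inputFile = shift @ARGV or die "Usage: $0 image.gif\n";  # GIF path from the command line
my $math = ReadFromFile($inputFile);
my $mathTypeId = "MathType003";
# Locate the extension header: 0x21 0xFF, the block-size byte, then the application ID.
$math =~ m/\x21\xFF(.)\Q$mathTypeId\E(.*)/s
    or die "No MathType application extension found in $inputFile\n";
my $blockSize = ord($1);
my $remain = $2;
length($mathTypeId) == $blockSize || print "Block size unmatched!\n";
# Each data block is a one-byte count followed by that many bytes;
# a count of 0 ends the sequence.
$blockSize = length($remain) ? ord(substr($remain, 0, 1)) : 0;
my $result = "";
while ($blockSize != 0)
{
    if (length($remain) < $blockSize + 1)
    {
        print "\nBlock size is NOT correct.\n";
        last;
    }
    $result .= substr($remain, 1, $blockSize);
    $remain = substr($remain, $blockSize + 1);
    $blockSize = length($remain) ? ord(substr($remain, 0, 1)) : 0;
}
print "RESULT: $result\n";

# Read the whole file as raw bytes.
sub ReadFromFile
{
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!\n";
    local $/;
    my $data = <$fh>;
    close($fh);
    return $data;
}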
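
The block layout described in the header comment can be checked without a real MathType image. The following is a minimal sketch, not part of the original script: `build_extension` and `extract_extension` are hypothetical helper names, and the fake "GIF" is just a signature string with the extension record and a trailer byte appended, so it is not a valid image. It wraps a payload in 255-byte sub-blocks and reads it back.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap $payload in a MathType-style Application Extension.
sub build_extension {
    my ($payload) = @_;
    my $out = "\x21\xFF\x0B" . "MathType" . "003";   # 14-byte header
    # Split into sub-blocks of at most 255 bytes, each prefixed by its length.
    for (my $pos = 0; $pos < length($payload); $pos += 255) {
        my $chunk = substr($payload, $pos, 255);
        $out .= chr(length($chunk)) . $chunk;
    }
    return $out . "\x00";    # a zero-length block terminates the data
}

# Hypothetical helper: find the header and concatenate the sub-blocks after it.
# Returns undef if the header is missing or the data is truncated.
sub extract_extension {
    my ($bytes) = @_;
    my $header = "\x21\xFF\x0B" . "MathType" . "003";
    my $start  = index($bytes, $header);
    return undef if $start < 0;

    my $pos    = $start + length($header);
    my $result = "";
    while ($pos < length($bytes)) {
        my $size = ord(substr($bytes, $pos, 1));
        return $result if $size == 0;                       # terminator reached
        return undef if $pos + 1 + $size > length($bytes);  # truncated block
        $result .= substr($bytes, $pos + 1, $size);
        $pos += 1 + $size;
    }
    return undef;    # ran off the end without seeing a terminator
}

# Round trip on a synthetic buffer (613 bytes, so three sub-blocks).
my $payload = "<math>" . ("x" x 600) . "</math>";
my $fake    = "GIF89a" . build_extension($payload) . "\x3B";
my $decoded = extract_extension($fake);
my $ok      = defined($decoded) && $decoded eq $payload;
print $ok ? "round trip ok\n" : "round trip failed\n";
```

Working with `index` and `substr` on raw bytes avoids regular expressions entirely, which sidesteps the `\C` escape that newer Perl releases no longer accept.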
Hi! I hope you receive this message - I am new to Perl, but desperately need this little script you have written. I am running Strawberry Perl on my Windows PC. When I attempt to run your script, however, I receive the following error: "Undefined subroutine &main::ReadFromFile called at extractMathMLfromGIF.pl line 28" Sorry for such a noob question, but what am I missing here? Please contact me at dans@thinkwell.com. Thanks! - Dan