@btbytes
Last active June 1, 2021 23:29
ps2txt

PS2txt

Source: ftp://rohan.sdsu.edu/pub/unix/ps2txt.c

/* Jason Black, Feb 22 1992
   Input can come from stdin, from '-' or from a file named on the command line.
   Flags:   -dvi for use with dvitps PostScript files.
   Usage:
       ps2txt [-dvi] [-] [input_file.ps]
   ps2txt.c extracts strings from a PostScript file. This version has been
   modified to correctly deal with the oddities of PostScript files generated by
   dvi-to-PostScript converters, so if you keep this and the original program
   around, you might want to rename one of them.
   VERSION: 1.1 Fixed bug dealing with comments.
            1.2 By popular demand: put spaces between strings.
            2.0 Fixed most problems of extraneous spaces and newlines
                between strings.
                Added support for the ligatures ff, fi, fl, ffi, & ffl.
                Re-designed the control structures, and otherwise cleaned
                up the code.
            2.1 Put Qazi's original algorithm back in, and added -dvi flag to
                use my more specific algorithm. Also by popular demand.
                Re-wrote the command line parsing yet again.
   History: Modified Qazi's program on Feb. 18 1992 so that it could do dvitps
            files well. Posted to alt.sources. Got feedback requesting support
            for regular PostScript files as well. Retrieved Qazi's original
            source code, and put it back in on Feb. 22 1992. While the original
            program concept and source code is from Iqbal Qazi, this version
            has had enough modifications that I am claiming it as my own. Qazi's
            sections are well marked if you want to see them.
   Comments/suggestions to cloister@u.washington.edu
*/
#include <stdio.h>
#include <stdlib.h> /* exit() */
#include <string.h> /* strcmp() */
#define Putc(x) putchar(x) /* makes some lines not exceed 80 chars. */
#define TRUE 1
#define FALSE 0
void dviparse(FILE *source); /* function prototypes */
void psparse(FILE *source);

int main(int argc, char *argv[])
{
    int i,                /* everybody's favorite counter */
        known_flag,       /* used during command line parsing */
        dvi_file = FALSE; /* true if -dvi option found on command line */
    FILE *file, *source;  /* input stream */

    source = stdin;       /* default input source */
    for (i = 1; i < argc; i++) /* parse command line args */
    {
        known_flag = FALSE;
        if (strcmp(argv[i], "-dvi") == 0) /* is it a dvitps PostScript file? */
        {
            dvi_file = TRUE;
            known_flag = TRUE;
        }
        if (strcmp(argv[i], "-") == 0) /* weirdo-user explicitly wants stdin */
        {
            source = stdin;
            known_flag = TRUE;
        }
        if (!known_flag) /* must be the input file name */
        {
            if ((file = fopen(argv[i], "r")) != NULL)
                source = file;
            else
            {
                fprintf(stderr, "ps2txt: error opening file %s\n", argv[i]);
                fprintf(stderr, "usage: ps2txt [-dvi] [-] [input_file.ps]\n");
                exit(1);
            }
        }
    }
    if (dvi_file)
        dviparse(source); /* use my algorithm */
    else
        psparse(source);  /* use Iqbal's algorithm */
    return 0;
}
void dviparse(FILE *source)
{
    int ch,               /* current character */
        prev_ch = '\n',   /* previously read character */
        in_paren = FALSE, /* inside or outside of parentheses? */
        b_flag = FALSE,   /* true if previous character was ')' */
        b_space = TRUE;   /* true if a 'b' should produce a space */

    while ((ch = fgetc(source)) != EOF)
    {
        /* ignore newlines in input; stop cleanly if a newline is the last char */
        if (ch == '\n' && (ch = fgetc(source)) == EOF) break;
        if (in_paren) /* strings to print come inside parentheses */
            switch (ch)
            {
            case ')'  : in_paren--; b_flag = 1; break; /* not in paren's anymore */
            case '\n' : Putc(' '); break; /* <cr> = ' ' in parens */
            case '\\' :
                switch (ch = fgetc(source))
                {
                case '(' :
                case ')' : Putc(ch); break;   /* from \? */
                case 't' : Putc('\t'); break; /* write a tab */
                case 'n' : Putc('\n'); break; /* write a <cr> */
                case '\\': Putc('"'); break;  /* open quotes */
                case '0' :
                    switch (ch = fgetc(source))
                    {
                    case '1':
                        switch (ch = fgetc(source))
                        {
                        case '3' : fputs("ff", stdout); break; /* from \01? */
                        case '4' : fputs("fi", stdout); break;
                        case '5' : fputs("fl", stdout); break;
                        case '6' : fputs("ffi", stdout); break;
                        case '7' : fputs("ffl", stdout); break;
                        default: fputs("\\01", stdout); Putc(ch); /* unknown code */
                        }
                        break; /* from \0? */
                    default: fputs("\\0", stdout); Putc(ch); /* unknown code */
                    }
                    break;
                case '1' : case '2' : case '3' : case '4' :
                case '5' : case '6' : case '7' :
                    Putc('\\'); /* unknown code; falls through to print the digit */
                default: Putc(ch);
                }
                break; /* from original switch */
            default: Putc(ch);
            }
        else /* not in paren's */
            switch (ch)
            {
            case '%' : /* toss out comments, however long the line is */
                while ((ch = fgetc(source)) != EOF && ch != '\n')
                    ;
                break;
            case '\n' : break; /* skip <cr>'s outside of parens */
            case '-' :
                if (b_flag)
                {
                    b_flag = 0;  /* because now prev. char != ')' */
                    b_space = 0; /* but the number after ')' is negative, so no */
                                 /* space in case the letter code is 'b'. */
                                 /* the default is b_space = 1 */
                }
                break;
            case '(' :
                in_paren++; /* back in parens again */
                switch (prev_ch) /* check prev char to see if we need a space */
                {
                case 'l' : case 'm' : case 'n' : case 'o' : /* not for these 8 */
                case 'q' : case 'r' : case 's' : case 't' :
                    break;
                case 'y' : Putc('\n'); break; /* need a newline */
                case 'b' : if (b_space) Putc(' '); break; /* 'b' w/ a + number */
                case 'a' : case 'c' : case 'd' : case 'e' :
                case 'f' : case 'g' : case 'h' : case 'i' :
                case 'j' : case 'k' : case 'x' : Putc(' '); break;
                default: break;
                }
                b_space = 1; /* reset flag to default for next time */
                break;
            default: b_flag = 0; break; /* junk stuff not in parens */
            }
        prev_ch = ch; /* remember this char in case !in_paren and next ch = '(' */
    }
}
void psparse(FILE *source) /* Iqbal's original uncommented program, unmodified */
                           /* except for stripping i/o stuff off the top, etc: */
{
    char junk[80];
    int ch, para = 0, last = 0;

    while ((ch = fgetc(source)) != EOF)
    {
        switch (ch)
        {
        case '%' : if (para == 0) fgets(junk, 80, source);
                   else putchar(ch);
                   /* no break: falls through to the '\n' case */
        case '\n' : if (last == 1) { puts(""); last = 0; } break;
        case '(' : if (para++ > 0) putchar(ch); break;
        case ')' : if (para-- > 1) putchar(ch);
                   else putchar(' ');
                   last = 1; break;
        case '\\' :
            if (para > 0)
                switch (ch = fgetc(source))
                {
                case '(' :
                case ')' : putchar(ch); break;
                case 't' : putchar('\t'); break;
                case 'n' : putchar('\n'); break;
                case '\\': putchar('\\'); break;
                case '0' : case '1' : case '2' : case '3' :
                case '4' : case '5' : case '6' : case '7' :
                    putchar('\\'); /* falls through to print the digit */
                default: putchar(ch); break;
                }
            break;
        default: if (para > 0) putchar(ch);
        }
    }
}
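
The nested switch in dviparse() turns the octal escapes \013 through \017 into the ligatures ff, fi, fl, ffi and ffl. As a rough sketch of the same mapping written as a lookup table, the helper below could be used; `ligature_for_octal` is a hypothetical name and is not part of ps2txt.c, and the table holds only the five codes that dviparse() handles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of ps2txt.c. Given the three octal digits
   that follow a backslash inside a PostScript string, return the ligature
   text dviparse() prints for them, or NULL for any other code. */
const char *ligature_for_octal(const char *digits)
{
    static const struct { const char *code; const char *text; } table[] = {
        { "013", "ff"  },
        { "014", "fi"  },
        { "015", "fl"  },
        { "016", "ffi" },
        { "017", "ffl" },
    };
    size_t i;

    assert(digits != NULL);
    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strncmp(digits, table[i].code, 3) == 0) /* compare 3 digits only */
            return table[i].text;
    return NULL; /* not a ligature code; caller decides how to echo it */
}
```

A caller would read the three digits after the backslash into a small buffer, pass it to the helper, and fall back to echoing the raw escape when the result is NULL, which is roughly what the `default:` branches in dviparse() do.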
@prashant-shahi

hmm...

@wujoho

wujoho commented Jun 1, 2021

I compiled the above code into an .exe file on Windows 10 and ran the commands below, but the output ends with an error. Is there a syntax error in my command line, or is this a PS2txt bug?

c:\Users\John\Desktop>ps2txt -dvi - C:\PrinterPlusPlus\Temp\QRCodePrinter_192.168.0.252_John_20210531_173253_3.ps
---- -mark- -dictionary- -null- -filestream- -savelevel- -fontid- /()-string- {}[]-array- {}[]-packedarray- VMerror VMerrorERROR: OFFENDING COMMAND: STACK: %%[ Error: ; OffendingCommand: ]%%This job requires more memory than is available in this printer.Try one or more of the following, and then print again:For the output format, choose Optimize For Portability.In the Device Settings page, make sure the Available PostScript Memory is accurate.Reduce the number of fonts in the document.Print the document in parts. %%[ PrinterError: Low Printer VM ]%%DefaultColorRendering* . .%%[ ProductName: ]%%glyxlocx%%[Page: 1]%%%%[LastPage]%%
c:\Users\John\Desktop>ps2txt C:\PrinterPlusPlus\Temp\QRCodePrinter_192.168.0.252_John_20210531_173253_3.ps


-mark-
-dictionary- -null-
-filestream- -savelevel-
-fontid-
/ (
) -string-
{ } [ ]
-array-
{ } [
] -packedarray-
VMerror
VMerror
ERROR: OFFENDING COMMAND:
STACK:
%%[ Error:
; OffendingCommand: ]%
%
This job requires more memory than is available in this printer.
Try one or more of the following, and then print again:
For the output format, choose Optimize For Portability.
In the Device Settings page, make sure the Available PostScript Memory is accurate.
Reduce the number of fonts in the document.
Print the document in parts.
%%[ PrinterError: Low Printer VM ]%%
DefaultColorRendering*
.
.
%%[ ProductName: ]%
%
glyx locx

%%[Page: 1]%%
%%[LastPage]%%
