The software package `bangtex` was developed for writing Bengali/Assamese in TeX/LaTeX format. If a user only wants TeX/LaTeX output in the form of a PS or PDF file, `bangtex` is sufficient and this program, `sei2uni`, is not needed.
The utility of this program is in producing Unicode Bengali text from files created for `bangtex`. This is useful because many publishers now want a Unicode file along with the PDF file, which helps their typesetting and page makeup. Once a `.tex` file written in the `bangtex` format exists, this program produces a `.txt` file with Unicode Bengali in it.
After the development of the original `bangtex`, some supporting programs were developed as front-ends to make the input process easier and faster. With these programs, one first writes a file and then runs the program on it to produce the `.tex` file that can be processed by TeX/LaTeX.
One such program is `seicor`, developed by Somendra Mohan Bhattacharjee. If one creates a preliminary file for `seicor` to run on, one can also apply the present program, `sei2uni`, directly to it to produce the Unicode file. In other words, in this case it is not even necessary to create the `.tex` file if the final interest lies in the Unicode file.
Thus, there are the following options for producing the Unicode file. This program, `sei2uni`, produces a Unicode `.txt` file from the files used in `seicor`. This means that, once one produces the seicor file (extension `_sei.tex`), one can use it in two ways:

- Create an almost-phonetic bangtex file which uses commands to put certain vowel symbols before the consonants with which they are joined, like `\*b*i\*d*esh` for printing বিদেশ in the output. No matter how this file is produced, run `sei2uni` on it to obtain the Unicode file.
- Create a file that can be transformed to the `.tex` file by using `seicor`. On this file, one can apply `sei2uni` to obtain the Unicode file.
To run this script, Perl must be installed. It has been tested with Perl v5.16.3, v5.26.1 and v5.30.0. The basic invocation is:
perl sei2uni.pl [options] input_file
Alternatively, if `sei2uni.pl` is made executable, one can use

sei2uni.pl [options] input_file

either giving the proper path to `sei2uni.pl` or including its location in the list of default paths.
The available options are:

- `-k`, `--keep-rm`: Keeps the `\rm` tags and their associated braces in the output `.txt` file. The default is to remove these tags and braces.
- `-o`, `--output-file`: Name of the output file. Defaults to `*.txt` for an input file named `*_sei.tex`, otherwise to `uni_out.txt`.
- `-p`, `--placeholder`: Only used internally. The default is `^#`; if that string occurs in the input, set this option to any string or character that does not appear in the input.
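The default naming rule for `--output-file` and the way the options combine on the command line can be sketched in Python. This is only an illustration of the behaviour described above, not the script's own Perl code. The function names `default_output_name` and `build_command` are hypothetical, and passing option values as separate arguments (`--output-file name`) is an assumption about the accepted syntax.

```python
def default_output_name(input_file: str) -> str:
    """Mimic the documented default: *_sei.tex -> *.txt, else uni_out.txt."""
    suffix = "_sei.tex"
    if input_file.endswith(suffix):
        # Drop the "_sei.tex" ending and append ".txt" instead.
        return input_file[: -len(suffix)] + ".txt"
    return "uni_out.txt"


def build_command(input_file, output_file=None, keep_rm=False, placeholder=None):
    """Assemble an argument list for running sei2uni.pl through perl.

    Assumption: option values are given as separate arguments.
    """
    cmd = ["perl", "sei2uni.pl"]
    if keep_rm:
        cmd.append("--keep-rm")
    if output_file is not None:
        cmd += ["--output-file", output_file]
    if placeholder is not None:
        cmd += ["--placeholder", placeholder]
    cmd.append(input_file)
    return cmd
```

A list built this way could be handed to `subprocess.run` from a batch script, with `default_output_name` telling the caller where to look for the result when `--output-file` is not given.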
The output of the program is a `.txt` file whose name is determined either by the default or by the user's specification, as described above.
This `.txt` file will contain all Bengali text from the sei file in Unicode characters. The following parts of the sei file are left unchanged:

- Any TeX/LaTeX command starting with a backslash. The unconverted region continues until the program finds a blank space or a line break in the sei file.
- Any text intended to appear in the Roman font, announced by `\rm`. These announcements must appear in one of the following formats in the sei file, where the capital letters stand for arbitrary content:
  - `\rm{ABCD}`
  - `{\rm{ABCD}}`
  - `{\rm ABCD PQRS}`
- Everything in math mode, provided math mode is opened and closed by the `$` sign.
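To make the three kinds of unconverted regions concrete, here is a minimal Python sketch that splits a line into protected spans and spans that would be converted. It is not how `sei2uni.pl` is implemented. The names `PROTECTED` and `split_protected` are hypothetical. The sketch also assumes that a TeX/LaTeX command is a backslash followed by a letter, so that bangtex sequences such as `\*b*i` remain convertible text.

```python
import re

# Alternatives are tried in order: math, then the three \rm formats,
# then a generic command running up to the next whitespace.
PROTECTED = re.compile(
    r"""
      (?P<math>\$[^$]*\$)
    | (?P<roman>\{\\rm\{[^{}]*\}\}|\{\\rm\s[^{}]*\}|\\rm\{[^{}]*\})
    | (?P<command>\\[A-Za-z]\S*)
    """,
    re.VERBOSE,
)


def split_protected(text):
    """Return (kind, span) pairs; kind is 'math', 'roman', 'command' or 'convert'."""
    parts = []
    pos = 0
    for m in PROTECTED.finditer(text):
        if m.start() > pos:
            # Text between protected spans is what a converter would process.
            parts.append(("convert", text[pos:m.start()]))
        parts.append((m.lastgroup, m.group()))
        pos = m.end()
    if pos < len(text):
        parts.append(("convert", text[pos:]))
    return parts
```

Joining the second element of every pair reproduces the input line, so a converter built on this split would only need to transform the spans marked `convert`.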
Here is an example of a short sei file and the `.txt` file produced by applying `sei2uni.pl` to it.
| Input bangtex file | Output Unicode file |
|---|---|
| \documentclass{barticle} | \documentclass{barticle} |
- There is no unique way of writing the ASCII sei file. For example, to produce the Bengali text ওই, one can use `OoI` in the input file so that the `O` and the `I` do not join in a ligature to give ঐ in the output. The same effect can be achieved by typing `O{I}` or `{O}I`, but `sei2uni.pl` works only on the first alternative, `OoI`. With the other alternatives, the braces will be visible in the output.
- The `sei2uni.pl` converter only converts the text. It does not understand TeX/LaTeX commands. So, for example, if there is a command for creating a table, the `.txt` file will not come out with a table. The same applies to any formatting command, such as those for figures or equations.