This is a first attempt to prepare searchable PDF files with Devanagari text using the devnag package. Searchability is achieved by
adding ToUnicode maps using
cmap.sty. This means that the feature is available with
pdflatex only. Users of pdfplain will have to find their own way by examining
The images below show the results of simple searches made with Acrobat 7.0 on Fedora Core 2 Linux. Clicking on a preview opens the large screenshot in its own window.
Demonstration of search capabilities (screenshots): search for mai.mne; search for utsaah.
Searching Devanagari text is not as simple as searching text in Latin alphabets. The first peculiarity is the i-matra, which is displayed before its consonant. Thus you cannot find words such as mis or dillii if you type them into the search field in the usual order. The characters must be entered in the order in which they are displayed. The Devanagari keyboard allows this.
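The i-matra rule can be sketched in a few lines of Python. This is only an illustration, not part of the devnag tools: the function name `to_display_order` is hypothetical, the input is assumed to be Unicode text in normal logical order, and only the simple case of a single consonant (optionally with a nukta) before the i-matra is handled.

```python
I_MATRA = "\u093f"  # DEVANAGARI VOWEL SIGN I
NUKTA = "\u093c"    # DEVANAGARI SIGN NUKTA


def is_consonant(ch):
    # Devanagari consonants KA (U+0915) through HA (U+0939)
    return "\u0915" <= ch <= "\u0939"


def to_display_order(word):
    """Move each i-matra in front of the consonant it follows in logical order."""
    out = []
    for ch in word:
        if ch == I_MATRA and out:
            pos = len(out) - 1
            if out[pos] == NUKTA and pos > 0:
                pos -= 1  # keep a nukta together with its consonant
            if is_consonant(out[pos]):
                out.insert(pos, ch)
                continue
        out.append(ch)
    return "".join(out)


# "mis": ma, i-matra, sa becomes i-matra, ma, sa
print(to_display_order("\u092e\u093f\u0938") == "\u093f\u092e\u0938")
```

The result is the string to type into the search field. Conjuncts are deliberately left out because the text above only describes the single-consonant case.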
Because the way characters are combined to form words is quite complex, Acrobat Reader
is often confused and adds word boundaries in the middle of a word. This is the case with the u-matra and
uu-matra. You therefore cannot find kulluu as a single word; you must enter it as two words, i.e.
ku lluu. Similarly, duur must be entered as duu followed by a space
and r. The word huu~m can be found if you type huu followed by a space and a
standalone candrabindu. This rule does not apply to words containing ru and ruu. These
syllables are contained in the
dvng fonts as special glyphs, and therefore they do not
create word boundaries.
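The word-boundary rule for u-matra and uu-matra can be illustrated with a small Python sketch. The function name `split_at_u_matra` is hypothetical, the input is assumed to be Unicode text in logical order, and the ru/ruu exception is approximated by checking whether the character directly before the matra is ra.

```python
U_MATRAS = {"\u0941", "\u0942"}  # DEVANAGARI VOWEL SIGN U and UU
RA = "\u0930"                    # DEVANAGARI LETTER RA


def split_at_u_matra(word):
    """Insert a space after u-matra or uu-matra, except in ru and ruu."""
    out = []
    last = len(word) - 1
    for i, ch in enumerate(word):
        out.append(ch)
        follows_ra = i > 0 and word[i - 1] == RA
        # No space is needed when the matra is the last character of the word.
        if ch in U_MATRAS and i < last and not follows_ra:
            out.append(" ")
    return "".join(out)


# "duur" (da, uu-matra, ra) becomes "duu r"
print(split_at_u_matra("\u0926\u0942\u0930") == "\u0926\u0942 \u0930")
```

A word ending in uu-matra, such as the second half of kulluu, needs no trailing space, which is why the sketch skips the last character.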
Words with vattu present other difficulties. There is no problem if the consonant with a vattu is available as a single glyph. If it is composed from pieces, it again generates a word boundary. When searching for .draaivar you must enter .dr followed by a space and aaivar. Unicode distinguishes matras from independent vowels. The Devanagari keyboard thus allows you to do what is impossible with Velthuis transliteration: start a word with aa-matra followed by independent i.
The dvng fonts do not contain an independent long a. It is composed from a short
independent a followed by aa-matra. You must keep this in mind when searching for
aayaa. The Devanagari keyboard allows you to type it this way. Independent o and au
are formed similarly. As a complex example, try to find the word aak.rti. First you must start
with short independent a followed by aa-matra. The r-matra creates a word boundary, therefore
a space must be inserted. Finally, i-matra must precede the t-consonant.
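The aak.rti example combines three rules, which the following Python sketch applies in one pass. It rests on several assumptions: the function name `search_string` is hypothetical, the input is Unicode text in logical order, the space is placed directly after the r-matra (by analogy with the u-matra case), and the decomposition of independent o and au into short a plus the corresponding matra is inferred from "formed similarly".

```python
# Independent vowels that the dvng fonts compose from short independent a
# plus a matra. The AA entry follows the text; O and AU are assumed.
DECOMPOSE = {
    "\u0906": "\u0905\u093e",  # AA -> A + vowel sign AA
    "\u0913": "\u0905\u094b",  # O  -> A + vowel sign O  (assumed)
    "\u0914": "\u0905\u094c",  # AU -> A + vowel sign AU (assumed)
}
R_MATRA = "\u0943"  # DEVANAGARI VOWEL SIGN VOCALIC R
I_MATRA = "\u093f"  # DEVANAGARI VOWEL SIGN I


def is_consonant(ch):
    # Devanagari consonants KA (U+0915) through HA (U+0939)
    return "\u0915" <= ch <= "\u0939"


def search_string(word):
    """Build the string to type into the search field for a logical-order word."""
    out = []
    last = len(word) - 1
    for i, ch in enumerate(word):
        if ch in DECOMPOSE:
            out.extend(DECOMPOSE[ch])        # long vowel as a + matra
        elif ch == I_MATRA and out and is_consonant(out[-1]):
            out.insert(len(out) - 1, ch)     # i-matra before its consonant
        else:
            out.append(ch)
            if ch == R_MATRA and i < last:
                out.append(" ")              # assumed boundary after r-matra
    return "".join(out)


# aak.rti: a, aa-matra, ka, r-matra, space, i-matra, ta
print(search_string("\u0906\u0915\u0943\u0924\u093f") == "\u0905\u093e\u0915\u0943 \u093f\u0924")
```

Whether Acrobat actually matches the resulting string still depends on the ToUnicode maps in the PDF, so the output should be treated as a starting point for experiments rather than a guaranteed hit.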
Unfortunately, I did not succeed in finding the word huaa.
Anusvaras and candrabindus do not create word boundaries. Words such as mai.mne and jaauu~mgaa are perfectly searchable.
The second test file,
examples.dn, demonstrates that you can find both variants of
kta no matter whether the ligature is available in your system font. Try to search for
yu ktatara.m and do not forget to add a space after the u-matra (see above).
Superscript repha acts similarly to the i-matra. It is written at the end of the akshara, so you have to put it there when filling in the search dialogue. When trying to search for munibhirmata.m, mu inibhmarta.m must be entered. Similarly durj~neya.m can be found as du j~nerya.m.
You can download a sample searchable file. It is the sample file from the devnag package with an added date so that I could verify
searching for digits. The tools are available in a single file,
dvngpdf.zip, together with
short installation instructions and the
*.dn sources for making the sample PDF.