Internationalization

Steve Atkin

Florida Tech

Last updated 7/20/2000

Internationalization in Haskell

The Hugs98 implementation of Haskell 98 does not directly support Unicode characters and strings. The module Unicode.hs converts Haskell strings to lists of 16-bit integers (Unicode) or to UTF-8 strings.
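As an illustration of the kind of conversion described above, here is a minimal sketch written against a modern Haskell implementation in which Char already covers the full Unicode range. It is not the code from Unicode.hs; the names toUtf16 and toUtf8 are hypothetical, and the use of surrogate pairs for code points above U+FFFF is an assumption about what "16-bit integer lists" should mean.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word16, Word8)

-- Hypothetical helper: encode a string as 16-bit code units.
-- Code points above U+FFFF become a surrogate pair (an assumption;
-- a plain 16-bit table could not represent them at all).
toUtf16 :: String -> [Word16]
toUtf16 = concatMap encode
  where
    encode c
      | n < 0x10000 = [fromIntegral n]
      | otherwise =
          let m = n - 0x10000
          in [ fromIntegral (0xD800 + (m `shiftR` 10))
             , fromIntegral (0xDC00 + (m .&. 0x3FF)) ]
      where n = ord c

-- Hypothetical helper: encode a string as UTF-8 bytes (1 to 4 per code point).
toUtf8 :: String -> [Word8]
toUtf8 = concatMap encode
  where
    encode c
      | n < 0x80    = [fromIntegral n]
      | n < 0x800   = [lead 0xC0 6, cont 0]
      | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
      | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
      where
        n = ord c
        -- Leading byte: length marker plus the top bits of the code point.
        lead tag s = fromIntegral (tag .|. (n `shiftR` s))
        -- Continuation byte: 10xxxxxx carrying six bits taken at shift s.
        cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))
```

For example, toUtf8 "A" yields the single byte 0x41, while a character such as U+20AC takes three bytes.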

Haskell Bidirectional Algorithm - HaBi

These modules implement the Unicode Bidirectional Algorithm in Haskell. See the paper "Implementations of Bidirectional Algorithms" (available as ps and pdf) for a more detailed explanation.

HaBi Modules

Purpose                     Module
Unicode converter           Unicode.hs
Character attribute table   Attributes.hs
Character mirroring table   Mirror.hs
Character mapping table     Charmap.hs
Bidi reordering             UniBidi.hs
Test (stdio)                Test.hs
Test (file io)              TestDriver.hs

Character Mapping Rules

Type   Arabic   Hebrew   Mixed    English
L      a - z    a - z    a - z    a - z
AL     A - Z             A - M
R               A - Z    N - Z
AN     0 - 9             5 - 9
EN              0 - 9    0 - 4    0 - 9
LRE    [        [        [        [
LRO    {        {        {        {
RLE    ]        ]        ]        ]
RLO    }        }        }        }
PDF    ^        ^        ^        ^
NSM    ~        ~        ~        ~
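The mapping rules above can be sketched as a single classification function. This is an illustration only, not the contents of Charmap.hs: the types BidiType and CharMap and the function classify are hypothetical names. The sketch assumes that the AL and AN rows apply to the Arabic and Mixed maps, that the R row applies to the Hebrew and Mixed maps, and that the EN row applies to the Hebrew, Mixed, and English maps. Characters that a map does not list produce Nothing.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- Bidirectional character types used in the table.
data BidiType = L | AL | R | AN | EN | LRE | LRO | RLE | RLO | PDF | NSM
  deriving (Eq, Show)

-- The four character maps (hypothetical type name).
data CharMap = Arabic | Hebrew | Mixed | English
  deriving (Eq, Show)

-- Look up the type a map assigns to an ASCII stand-in character.
classify :: CharMap -> Char -> Maybe BidiType
classify cm c
  | isAsciiLower c = Just L
  | isAsciiUpper c = upper cm
  | isDigit c      = digit cm
  | otherwise      = lookup c controls
  where
    -- Upper-case letters stand in for right-to-left letters.
    upper Arabic  = Just AL
    upper Hebrew  = Just R
    upper Mixed   = Just (if c <= 'M' then AL else R)
    upper English = Nothing  -- not listed in the table
    -- Digits are Arabic or European numbers depending on the map.
    digit Arabic  = Just AN
    digit Hebrew  = Just EN
    digit Mixed   = Just (if c <= '4' then EN else AN)
    digit English = Just EN
    -- Explicit embedding, override, and mark stand-ins, same in every map.
    controls =
      [ ('[', LRE), ('{', LRO), (']', RLE), ('}', RLO), ('^', PDF), ('~', NSM) ]
```

Under these assumptions, classify Mixed 'M' gives Just AL and classify Mixed 'N' gives Just R.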

Sample Usage

The module Test.hs tests HaBi on characters entered from the keyboard. The test loops until it reads an empty line. The command format is: runhugs Test [character map]. For example, entering runhugs Test arabic at a prompt uses the Arabic character map to determine the character types.
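The loop-until-empty-line behaviour can be sketched as a pure function over the input text. This is not the code in Test.hs; untilBlank is a hypothetical name, and the function passed to it stands in for whatever reordering the real module performs.

```haskell
-- Apply a per-line transformation to each input line, stopping at the
-- first empty line. Anything after the empty line is ignored.
untilBlank :: (String -> String) -> String -> String
untilBlank f = unlines . map f . takeWhile (not . null) . lines
```

A driver could then be written as interact (untilBlank f) for a suitable f, relying on lazy input to process lines as they arrive.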

The module TestDriver.hs takes its input from a file. Comments, introduced by "--", may appear in the file. The command format is: runhugs TestDriver [file in] [file out] [character map]. For example, entering runhugs TestDriver weak.in weak.out mixed at a prompt reads the file weak.in using the mixed character mapping rules and writes the reordered stream to weak.out.
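One way such an input file might be pre-processed is sketched below. This is not the code in TestDriver.hs. It assumes that a comment runs from "--" to the end of the line and that lines left empty after stripping are skipped; neither detail is stated above. The names stripComment and testLines are hypothetical.

```haskell
import Data.List (isPrefixOf)

-- Drop everything from the first "--" to the end of the line
-- (assumed comment syntax).
stripComment :: String -> String
stripComment [] = []
stripComment s@(c:cs)
  | "--" `isPrefixOf` s = []
  | otherwise           = c : stripComment cs

-- Split file contents into test lines, removing comments and
-- discarding lines that end up empty (an assumption).
testLines :: String -> [String]
testLines = filter (not . null) . map stripComment . lines
```

Text before a trailing comment is kept unchanged, including any spaces that precede the "--".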

Internationalization Bibliography
