This directory contains a small collection of files that might be of
interest to people working with ISO character sets. It is mirrored on .

You might also be interested in the following Internet locations:

  http://www.dkuug.dk/JTC1/SC2/
  http://www.unicode.org/
  ftp://unicode.org/pub/
  http://www.ecma.ch/
  http://www.indigo.ie/egt/standards/
  http://www.blueneptune.com/~tseng/Unicode/Unicode2.0.html
  http://www.kostis.net/charsets/
  http://czyborra.com/charsets/
  http://www.itscj.ipsj.or.jp/ISO-IR/

Especially have a look at if you need Unicode to and from anything
mapping tables.

Markus

--
Markus Kuhn, Computer Science student -- University of Erlangen,
Internet Mail: - Germany
WWW Home:

-------------------------------------------------------------------------

INDEX
        This file.

ISO-10646-UTF-8.html
ISO-10646-UTF-16.html
        An unofficial HTML version of the final draft text for the new
        ISO/IEC 10646-1 extensions that specify the UTF-8 and UTF-16
        encodings (see also ).

ISO-10646-summary
        A few facts about the ISO 10646 character set (UCS), which has
        been designed as the 'final' character set incorporating all
        languages of the world and which is also known as Unicode
        (see also utf-8.c and ucs-list.gz).

ISO-8859-1-table.ps.Z
        A 300 dpi scanned character set table of the Latin alphabet
        No. 1 defined in ISO 8859-1.

UTF-8-Plan9-paper.ps.gz
        A paper from the January 1993 Usenix Proceedings about the use
        of the UTF-8 encoding of ISO 10646 in the Plan 9 operating
        system. Recommended reading! See also file utf-8.c and
        ISO-10646-UTF-8.html.

cde14651.pdf
        ISO CD 14651 is a draft sorting standard for Unicode strings.

charset-standards-groups
charset-standards-list
        A list of currently published character set standards and of
        committees working on character set standardization.

cpi120.zip
isocp101.zip
isokb100.zip
        This is software developed by Kosta Kostis that allows you to
        use ISO 8859 codepages with MS-DOS.
        The ISOCP package contains the files a normal user needs, CPI
        contains special tools for editing MS-DOS *.CPI files, and
        ISOKB contains the source code of the keyboard control programs
        in the ISOCP package. CPI and ISOKB are for experts only.

draft-yergeau-utf8-00.txt
        Internet-Draft describing UTF-8 and specifying a MIME character
        set identifier for UTF-8.

everson-mono-ucs-font
        An announcement of a new free full ISO 10646 font and where it
        can be downloaded.

iso2asc.c
        A highly portable, easy-to-install, sophisticated ISO 8859-1 to
        ASCII text file converter. It has options for different
        languages and doesn't destroy tables like many more primitive
        converters. The manual is printed if iso2asc is called without
        command line options.

iso2asc.txt
        The theory behind iso2asc.c. Read this if you want to include
        the algorithm in your application!

isofont101.tar.gz
        IBM PC VGA fonts for ISO 8859 (for DOS users, see also
        isocpi*.zip).

konvers-911.tar.gz
        Emacs Lisp functions by Karl Brodowsky for converting
        ISO 8859-1 to TeX-style 7-bit representation.

mimeb150.zip
mimes150.zip
        These are patches (binary and sources) for MS-DOS Waffle that
        allow you to use ISO 8859-1 and MIME.

rfc1641.ps.gz
rfc1641.txt
        Using Unicode with MIME.

rfc1642.ps.gz
rfc1642.txt
        A Mail-Safe Transformation Format of Unicode (UTF-7).

tcs.tar.gz
        The Plan 9 character set converter.

trans113.tar.gz
        Kosta Kostis' character set converter toolkit. This package
        contains nice tables of many common character sets (ISO and
        vendor specific) which list the ISO 10646 number for each
        character. These tables are used to create conversion tables
        from any character set to any other one.

ucs-input-methods
        A draft for a new ISO standard which specifies standard
        keyboard input methods for ISO 10646 characters (probably
        outdated).

ucs-list.gz
        A text file with a table that lists the names of all ISO 10646
        and Unicode 1.1 characters. One column in this table contains
        the character encoded in UTF-8 (see utf-8.c for details).

ucs-map-cp1252
        ISO 10646 code mapping for the Microsoft Windows Latin1
        character set (a superset of ISO 8859-1 with additional
        characters in the 0x80-0x9f range).

ucs-map-cp437
        ISO 10646 code mapping for the IBM PC and classic MS-DOS
        character set.

ucs-map-cp850
        ISO 10646 code mapping for the modern MS-DOS and OS/2 Latin1
        character set.

ucs-map-mac
        ISO 10646 code mapping for the Apple Macintosh Roman character
        set.

utf-1.c
        C routines for handling the ISO 10646 UTF-1 coding. This
        encoding is defined in an appendix of ISO 10646, but is already
        considered obsolete. Better have a look at UTF-8 (see file
        utf-8.c).

utf-8.c     <- READ THIS IF YOU HAVE NEVER HEARD ABOUT UTF-8!!!
        C routines and specification for the ISO 10646 UTF-8 encoding
        for UNIX systems (old names for the same encoding are FSS-UTF
        and UTF-2). This is very likely to be the way ISO 10646 will be
        used on UNIX systems in the near future. See also file
        ISO-10646-UTF-8.html and UTF-8-Plan9-paper.ps.gz.
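
For readers who have never seen UTF-8, the byte layout behind the
utf-8.c entry can be sketched in a few lines of C. This is only a
minimal illustration and is not code taken from utf-8.c: the function
name ucs_to_utf8 is hypothetical, and the sketch assumes the modern
four-byte limit (values up to U+10FFFF, surrogates rejected), whereas
the original ISO 10646 definition allowed sequences of up to six bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not from utf-8.c): encode one UCS code point
 * as UTF-8. Writes 1 to 4 bytes into out and returns the byte count,
 * or 0 if c is a surrogate (U+D800..U+DFFF) or above U+10FFFF.
 * Assumption: the caller provides room for at least 4 bytes. */
size_t ucs_to_utf8(unsigned long c, unsigned char *out)
{
    if (c < 0x80) {
        /* 0xxxxxxx: ASCII is passed through unchanged */
        out[0] = (unsigned char) c;
        return 1;
    }
    if (c < 0x800) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xC0 | (c >> 6));
        out[1] = (unsigned char) (0x80 | (c & 0x3F));
        return 2;
    }
    if (c >= 0xD800 && c <= 0xDFFF)
        return 0;               /* surrogates are not characters */
    if (c < 0x10000) {
        /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xE0 | (c >> 12));
        out[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF) {
        /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xF0 | (c >> 18));
        out[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (c & 0x3F));
        return 4;
    }
    return 0;                   /* outside the range assumed here */
}
```

For example, U+00E4 (0xE4 in ISO 8859-1) encodes as the two bytes
0xC3 0xA4, while every ASCII character stays a single byte, so plain
ASCII files are already valid UTF-8.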