| charsets {tools} | R Documentation |
charset_to_Unicode is a matrix of Unicode points with columns
for the common 8-bit encodings.
Adobe_glyphs is a data frame that gives Adobe glyph names for
Unicode points. It has two character columns, "adobe" and
"unicode" (a 4-digit hex representation).
charset_to_Unicode
Adobe_glyphs
charset_to_Unicode is an integer matrix of class
c("noquote", "hexmode"), so it prints in hexadecimal.
The mappings are those used by libiconv: sources differ in how they
map quotes and minus/hyphen (and the PostScript encoding files use a
different mapping again).
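
Because of the "hexmode" class, a column extracted from the matrix still prints in hexadecimal. The following is a minimal sketch, not part of the original help page, of how one might drop the printing classes to work with plain integers and format them as hex again for display. The column name "ISOLatin2" is taken from this page's example; the names `cp` and `hex` are illustrative only.

```r
library(tools)

# Column name taken from this page's example
latin2 <- charset_to_Unicode[, "ISOLatin2"]

# Drop the printing classes to work with plain integer code points
cp <- as.integer(unclass(latin2))

# Format plain integers as hex again when needed for display
# (entries that are NA, if any, are skipped)
hex <- format(as.hexmode(cp[!is.na(cp)]), width = 4, upper.case = TRUE)
head(hex)
```

Working on the plain integer vector avoids relying on how classed objects are handled by comparison and matching functions.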
Adobe_glyphs includes all the Adobe glyph names which correspond to
single Unicode characters. It is sorted by Unicode point and, within a
point, alphabetically by glyph name (there can be more than one name
for a Unicode point). The data are in the file
‘R_HOME/share/encodings/Adobe_glyphlist’.
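
Since a Unicode point can have several glyph names, a lookup by code point may return more than one row. As a rough sketch (the variable names `cp` and `n_names` and the choice of U+0020 are illustrative assumptions, not part of the original page), the "unicode" column can be parsed with strtoi() and then queried:

```r
library(tools)

# Parse the 4-digit hex strings in the "unicode" column into integers
cp <- strtoi(Adobe_glyphs$unicode, base = 16L)

# All glyph names recorded for one code point (U+0020 chosen arbitrarily)
Adobe_glyphs$adobe[cp == 0x0020]

# Code points that carry more than one glyph name
n_names <- table(Adobe_glyphs$unicode)
head(names(n_names)[n_names > 1])
```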
http://partners.adobe.com/public/developer/en/opentype/glyphlist.txt
## find Adobe names for ISOLatin2 chars.
latin2 <- charset_to_Unicode[, "ISOLatin2"]
## convert the 4-digit hex strings to numeric code points
aUnicode <- as.numeric(paste0("0x", Adobe_glyphs$unicode))
## keep only the glyphs whose code point occurs in ISOLatin2
keep <- aUnicode %in% latin2
aUnicode <- aUnicode[keep]
aAdobe <- Adobe_glyphs[keep, 1]
## first match
aLatin2 <- aAdobe[match(latin2, aUnicode)]
## all matches
bLatin2 <- lapply(seq_along(latin2), function(x) aAdobe[aUnicode == latin2[x]])
format(bLatin2, justify = "none")