set-char-mask - Set character word mask
n set-char-mask "flags" ["value"]
set-char-mask returns or modifies the setting of MicroEmacs internal character tables. The argument n defines the action to be taken, as follows:-
The first argument "flags" determines the required character set as follows:-
Unlike other sets, this set (flag M) cannot be incrementally altered. Any call to alter it resets all the character tables, so the character mapping must be performed first and in a single call. No other set may be altered in the same call. When setting, the "value" must supply pairs of characters: an ISO-8859-1 character followed by its system font equivalent.
Note that the returned character list will pair all lower-case characters with their upper-case equivalent letters first.
1, 2, 3 & 4
As with flag M, this cannot be incrementally altered. Any call to set this mapping first resets the mapping table, so the mapping must be performed in a single call. No other set may be altered in the same call. When setting, the "value" must supply pairs of characters: the ISO-8859 non-Latin character followed by its Latin character mapping.
Unless stated otherwise, multiple flags may be specified at the same time, either returning a combined character set or setting multiple properties for the given "value" characters.
For many UNIX XTerm fonts the best characters to use for $box-chars(5) (used in drawing osd(2) dialogs) lie in the range 0x0B to 0x19. For example the vertical bar is '\x19', the top left hand corner is '\x0D' etc. These characters are by default set to be not displayable or pokable which renders them useless. They can be made displayable and pokable as follows:-
set-char-mask "dp" "\x19\x0D\x0C\x0E\x0B\x18\x15\x0F\x16\x17\x12"
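The character list in the command above is easy to mistype. As a minimal sketch only, the following Python helper (dp_command is a hypothetical name, not part of MicroEmacs) rebuilds the same line from a list of code points and checks that each one falls in the 0x0B to 0x19 range quoted in the text.

```python
# Code points copied from the set-char-mask "dp" example in the text.
BOX_CODES = [0x19, 0x0D, 0x0C, 0x0E, 0x0B, 0x18, 0x15, 0x0F, 0x16, 0x17, 0x12]

def escape_codes(codes):
    """Return the codes written as \\xHH escapes, as in the example string."""
    return "".join("\\x%02X" % code for code in codes)

def dp_command(codes):
    """Hypothetical helper: format a set-char-mask "dp" line for the codes."""
    return 'set-char-mask "dp" "%s"' % escape_codes(codes)

# Every code lies inside the 0x0B..0x19 range quoted in the text.
assert all(0x0B <= code <= 0x19 for code in BOX_CODES)
print(dp_command(BOX_CODES))
```

The printed line can be pasted into a macro file. The helper only formats text and makes no claim about which glyphs a given font places at these positions.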
MicroEmacs variables have either a '$', '#', '%', ':' or '.' character prepended to their name, and may also contain a '-' character in the body of their name. It is preferable for these characters to be part of the variable 'word' so that commands like forward-kill-word(2) work correctly. This may be achieved by adding these characters to user set 2 and setting the buffer-mask variable to include set 2, as follows:
set-char-mask "2" "$#%:.-"
define-macro fhook-emf
    set-variable $buffer-mask "luh2"
    .
    .
!emacro
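The effect of extending the word character set can be pictured outside the editor. The Python sketch below is a model only, not MicroEmacs code: it assumes the default word characters are the ASCII letters and digits, and the function name words is hypothetical. It shows how adding the user set 2 characters changes where word boundaries fall.

```python
import re
import string

# Assumption: the default word characters are ASCII letters and digits.
BASE_WORD_CHARS = string.ascii_letters + string.digits
# Characters added to user set 2 in the example in the text.
USER_SET_2 = "$#%:.-"

def words(text, word_chars):
    """Split text into maximal runs of characters taken from word_chars."""
    pattern = "[" + re.escape(word_chars) + "]+"
    return re.findall(pattern, text)

line = "set-variable $buffer-mask"
print(words(line, BASE_WORD_CHARS))
print(words(line, BASE_WORD_CHARS + USER_SET_2))
```

With the base set alone the variable name breaks into separate fragments. With set 2 included, $buffer-mask is treated as a single word, which is the behaviour a word-based command needs.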
For the examples below only the following subset of characters will be used:-
Character                 ISO-8859-1   Windows OEM   PC Page 437
Capital A (A)             A            A             A
Capital A grave (`A)      \xC0         \xB7          No equivalent
Capital A acute ('A)      \xC1         \x90          No equivalent
Small a (a)               a            a             a
Small A grave (`a)        \xE0         \x85          \x85
Small A acute ('a)        \xE1         \xA0          \xA0
As the spell checker only operates in ISO-8859-1 (Latin 1), the character font mapping (flag M) must be correctly set up for spell checking to operate correctly. For ISO-8859-1 (ISO) this is an empty string as the default mapping is correct, but for both Windows OEM (OEM) and PC Code Page 437 (PC-437) the mappings should be set as follows:-
; OEM font mapping setup
set-char-mask "M" "\xC0\xB7\xC1\x90\xE0\x85\xE1\xA0"
; PC-437 font mapping setup
set-char-mask "M" "\xC0A\xC1AAA\xE0\x85\xE1\xA0"
As all the characters in ISO have equivalents in OEM, the mapping for OEM is a simple ISO to OEM character list. However, the missing capital A's in PC-437 cause problems. For the command charset-iso-to-user(3) it is preferable for a mapping of `A to be given, otherwise the document being converted may remain unreadable. Therefore a mapping of `A to A is given to alleviate this problem; similarly, 'A is also mapped to A.
This leads to a similar problem with the conversion of PC-437 back to ISO (the operation of command charset-user-to-iso(3)). If only the mapping "\xC0A\xC1A" were given, the last mapping ('A to A) would also be the back conversion for A, i.e. ALL A's would be converted back to 'A's. To solve this problem, a further, seemingly pointless mapping of A to A is given to correct the back conversion.
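The back conversion problem can be reproduced with a small model. The Python sketch below is not the MicroEmacs implementation: it assumes that the pair string fills a forward (ISO to user) table and a back (user to ISO) table, and that a later pair overwrites an earlier one for the same key, which is the behaviour the paragraph above describes. The names parse_pairs, build_tables and convert are hypothetical.

```python
def parse_pairs(value):
    """Split a flat mapping string into (iso, user) character pairs."""
    if len(value) % 2:
        raise ValueError("mapping string must hold whole pairs")
    return [(value[i], value[i + 1]) for i in range(0, len(value), 2)]

def build_tables(value):
    """Build forward (ISO to user) and back (user to ISO) tables.

    Assumption: a later pair overwrites an earlier one for the same key.
    """
    to_user, to_iso = {}, {}
    for iso, user in parse_pairs(value):
        to_user[iso] = user
        to_iso[user] = iso
    return to_user, to_iso

def convert(text, table):
    """Map each character through table, leaving unmapped ones unchanged."""
    return "".join(table.get(ch, ch) for ch in text)

# PC-437 mapping without and with the extra "AA" pair.
_, back_short = build_tables("\xC0A\xC1A\xE0\x85\xE1\xA0")
to_user, back_full = build_tables("\xC0A\xC1AAA\xE0\x85\xE1\xA0")

print(repr(convert("A\x85", back_short)))
print(repr(convert("A\x85", back_full)))
```

In this model the short mapping converts a plain A back to 'A, while the mapping with the extra A to A pair leaves it as A. The forward conversion of `A and 'A to A is the same in both cases.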
For languages which use these characters, the alphabetic character set must be extended to include them for letter based commands like forward-word(2) and upper-case-word(2) to operate correctly. The addition of extra letters must achieve two goals. The first is to define whether a character is a letter, enabling commands like forward-word to work correctly. The second is to provide an upper case to lower case character mapping, enabling commands like upper-case-word to work correctly. This is achieved with a single call to set-char-mask using the a flag as follows:-
set-char-mask "a" "\xC0\xE0\xC1\xE1"
Note that this flag always expects an ISO-8859-1 character. This allows the same map character list to be used regardless of the font set being used, i.e. the above line can be used for ISO, OEM and PC-437 fonts. But it does mean that the ISO to user font character mapping (flag M) must already have been performed.
Problems similar to those encountered with the M flag arise here with font PC-437. This is not immediately obvious because the mapping is given in ISO, but when it is converted to PC-437 the mapping string becomes "A\x85A\xA0". As can be seen, A is mapped last to 'a, so an upper to lower character operation will convert an A to 'a. A similar solution is used: a further mapping of A to a is given to correct the default case mapping for both A and a, i.e. the following line should always be used instead:-
set-char-mask "a" "\xC0\xE0\xC1\xE1Aa"
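The case mapping clash can be modelled in the same spirit. The Python sketch below is illustrative only and rests on two assumptions: the flag a pairs are converted to the user font before the case table is built, and a later pair overwrites an earlier one for the same upper case character. ISO_TO_PC437, to_user and lower_table are hypothetical names, and the dictionary simply restates the PC-437 flag M pairs given earlier.

```python
# ISO-8859-1 to PC-437 pairs from the flag M example, as a dictionary.
ISO_TO_PC437 = {"\xC0": "A", "\xC1": "A", "A": "A", "\xE0": "\x85", "\xE1": "\xA0"}

def to_user(text):
    """Convert ISO characters to the user font, leaving others unchanged."""
    return "".join(ISO_TO_PC437.get(ch, ch) for ch in text)

def lower_table(iso_value):
    """Build an upper-to-lower table from upper/lower pairs given in ISO.

    Assumption: pairs are converted to the user font first, and a later
    pair overwrites an earlier one for the same upper-case character.
    """
    value = to_user(iso_value)
    table = {}
    for i in range(0, len(value) - 1, 2):
        table[value[i]] = value[i + 1]
    return table

print(repr(to_user("\xC0\xE0\xC1\xE1")))
print(repr(lower_table("\xC0\xE0\xC1\xE1")["A"]))
print(repr(lower_table("\xC0\xE0\xC1\xE1Aa")["A"]))
```

Under these assumptions the four character list leaves A lower-casing to 'a, and appending the Aa pair restores the expected result of a.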
Copyright (c) 1998-2006 JASSPA
Last Modified: 2005/01/15
Generated On: 2006/10/07