I was struck by Norm Walsh’s essay Goodbye DTDs, in which he talks of going to an all-RelaxNG environment, no more DTDs. Within seconds of seeing it I IM’d him asking “What about special characters?” and he pointed out that there would still be some entity declarations around. ongoing has a DTD too, but I’d rather it didn’t, so I decided to see if I could wrestle Emacs to the ground so I wouldn’t need one. Of possible interest only to the eleven people in the world who edit XML in Emacs and know what “i18n” stands for. [Updated; skip to the end for a neato char-insertion function.]
The problem is, not only do I regularly want to use non-ASCII characters like ½ and ä, I want “smart quotes” like those you see around the first instance of "smart quotes", not dumb quotes like those around the second. Also real apostrophes (’ not ').
I think the Mac has input methods that let you type these things in, but they don’t seem to play nice with Emacs.
Of course, you can always enter ä as a numeric character reference, &#xe4;, but that kind of sucks, so most bound-to-ASCII people like to use something like &auml; which is helpfully built-in to HTML, except that if you’re in XML-land it’s not built in, so you have to have somewhere to declare it, which is a DTD. This is why there’s a file in the production system here called ongoing.dtd that gets stuck on the front of each of these notes as it’s rendered for the Web. As a side-effect of going through the XML machinery, what gets published has none of this, just the naked UTF-8 characters.
Anyhow, dammit, Emacs is modern software and ought to give me some way to type in and view UTF-8 characters directly; few fonts are so primitive as to be missing smart quotes and so on. I got it working eventually; here’s how:
Language Environment, Huh? · Emacs has its own idea of how to store characters, which I haven’t really figured out yet but fortunately you don’t have to, because you can force it to use UTF-8 when it saves, like so:
(set-language-environment "UTF-8")
I just put that in my .emacs, but you could put it in a special XML-editing ghetto if you wanted.
Now it seems to me that “language environment” is not really a category into which I’d sort the string “UTF-8” but whatever.
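The language environment mostly sets defaults. If you would rather spell out the encoding choices as well, something like the following might do it. This is a minimal sketch and not what is in the .emacs described here: prefer-coding-system and set-default-coding-systems are standard Emacs functions, but whether you need them on top of the set-language-environment call is an assumption that will vary by Emacs version.

```elisp
;; Sketch only: the two extra calls are not part of the original setup.
(set-language-environment "UTF-8")

;; Put UTF-8 first when Emacs has to guess or choose a coding system.
(prefer-coding-system 'utf-8)

;; Make UTF-8 the default for new buffers and for subprocess I/O.
(set-default-coding-systems 'utf-8)
```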
Font Pain · Emacs has this arcane structure of fonts and faces that, once again, I’ve never really figured out, but fortunately, once again, you don’t have to, you can just use a stone axe, all you have to do is to figure out the X11-i-fied name of a font with some basic Unicode moxie. This is what works under OS X:
(set-default-font "-etl-fixed-medium-r-normal-*-16-*-*-*-*-*-fontset-mac")
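As far as I know, set-default-font was later declared obsolete in favor of set-frame-font, so on a newer Emacs the same stone axe would look roughly like the sketch below. The font name "Menlo" is purely an assumption for illustration; substitute any installed font with reasonable Unicode coverage.

```elisp
;; Sketch for newer Emacs versions. "Menlo" is a hypothetical choice;
;; the guard skips the call when that font is not installed (or when
;; Emacs is running without a graphical display).
(when (member "Menlo" (font-family-list))
  ;; Arguments: FONT, KEEP-SIZE, FRAMES. Passing t for FRAMES applies
  ;; the font to all existing frames and to future ones.
  (set-frame-font "Menlo-14" nil t))
```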
Jamming In the Bytes · The magic function here is ucs-insert, which of course works quite differently in interactive and background modes. The real hackers can stop here, because they’ll all now have ideas how to weave ucs-insert into their own lifestyles. I have hardwired keybindings for smart quotes, for the rest of them I offer the following solution; my elisp is rusty and far from idiomatic but it should give the idea. The elisp function x-popup-menu, by the way, generally sucks, but whatever.
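The hardwired smart-quote bindings are not shown here, so the following is only a guess at their general shape. The function names and the F7/F8 keys are hypothetical. It also uses insert-char rather than ucs-insert, on the assumption that you are on a newer Emacs, where (as far as I know) ucs-insert has been obsoleted in favor of insert-char.

```elisp
;; Hypothetical helpers and key choices, for illustration only.
(defun my-insert-ldquo ()
  "Insert a left double quotation mark (U+201C)."
  (interactive)
  (insert-char #x201c 1))

(defun my-insert-rdquo ()
  "Insert a right double quotation mark (U+201D)."
  (interactive)
  (insert-char #x201d 1))

;; One key per quote; pick whatever keys are free in your setup.
(global-set-key [f7] 'my-insert-ldquo)
(global-set-key [f8] 'my-insert-rdquo)
```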
(defvar ongoing-char-choice
  ;; x-popup-menu format: (TITLE (PANE-TITLE (LABEL VALUE) ...)).
  ;; Integer values are code points; string values are inserted verbatim.
  '("Special characters"
    (""
     ("ccedil" #xe7)
     ("copyright" #xa9)
     ("degree" #xb0)
     ("dot" #xb7)
     ("eacute" #xe9)
     ("half" "&#xbd;")
     ("omacr" "&#x14d;")
     ("auml" #xe4) ; #xe4 is a-umlaut, so the label says so
     ("uuml" #xfc)
     ("euro" #x20ac)
     ("cents" #xa2)
     ("egrave" #xe8)
     ("lsquo" #x2018)
     ("rsquo" #x2019)
     ("ldquo" #x201c)
     ("rdquo" #x201d)
     ("mdash" #x2014))))

(defun ong-special-chars-menu ()
  "Insert a special character from a menu."
  (interactive)
  (let ((value
         (car (x-popup-menu
               (list '(10 10) (selected-window))
               ongoing-char-choice))))
    (cond
     ((integerp value) (ucs-insert value))
     ((stringp value) (insert value))
     (t nil)))) ;; so you can hit escape and make the menu go away
Bind ong-special-chars-menu to some handy key and you’re cooking. I could have put the unicode characters themselves in the left-hand column but (on OS X at least) whatever displays menus is stuck firmly in 8859-land, thus the names. The half and omacr characters are given as NCRs not literals because they’re not in the font I edit in, but that’s OK, they don’t show up that often.
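For completeness, a binding could be as small as the line below. The choice of F9 is arbitrary and only an assumption for illustration.

```elisp
;; F9 is a hypothetical choice; any free key will do.
(global-set-key [f9] 'ong-special-chars-menu)
```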
Now the source-code of ongoing that I edit is ever so much prettier.
Easy Insertion of Commonly-Used Special Characters · I was worrying about how to make the job of inserting the special characters that I use all the time easier. These are most commonly the smart quotes and apostrophes, occasionally an em-dash and so on. It dawned on me that I now never need to type an old-fashioned apostrophe any more, so I bound that key to the following function. So you type apostrophe twice to get a smart apostrophe, you type apostrophe-S to open or close single quotes, apostrophe-D for double quotes, apostrophe-dash for mdash, plus a couple of other handy little things:
(defun one-quote ()
  "Insert a plain ASCII apostrophe."
  (interactive)
  (insert ?\'))

(defvar sq-state nil "In single quotes?")
(defvar dq-state nil "In double quotes?")

(defun ong-insert-special (c)
  "Insert special characters, like so:
 s => open/close single quotes
 d => open/close double quotes
 ' => apostrophe
 a => <a href=
 i => <img src=
 & => &amp;
 < => &lt;
 - => mdash
 . => center-dot"
  (interactive "c") ; read the single character that follows the apostrophe
  (cond
   ((= c ?s)
    (if sq-state
        (progn
          (ucs-insert #x2019)
          (setq sq-state nil))
      (ucs-insert #x2018)
      (setq sq-state t)))
   ((= c ?d)
    (if dq-state
        (progn
          (ucs-insert #x201d)
          (setq dq-state nil))
      (ucs-insert #x201c)
      (setq dq-state t)))
   ((= c ?\') (ucs-insert #x2019))
   ((= c ?a)
    (if (> (current-column) 0) (newline-and-indent))
    (insert "<a href=\"\">")
    (backward-char 2)) ; leave point between the href quotes
   ((= c ?i)
    (if (> (current-column) 0) (newline-and-indent))
    (insert "<img src=\"\" alt=\"\" />")
    (backward-char 11)) ; leave point between the src quotes
   ((= c ?&) (insert "&amp;"))
   ((= c ?<) (insert "&lt;"))
   ((= c ?-) (ucs-insert #x2014))
   ((= c ?.) (ucs-insert #xb7))))
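The binding itself is not shown above, so here is one way it might be wired up, as a sketch under a couple of assumptions. It assumes your XML editing mode is derived from text-mode, so that text-mode-hook runs for it, and it uses a buffer-local binding so that the apostrophe keeps working normally in Lisp buffers. The key chosen for one-quote is hypothetical.

```elisp
;; Assumption: the XML mode in use runs text-mode-hook; if it has its
;; own hook, use that instead.
(add-hook 'text-mode-hook
          (lambda ()
            ;; The apostrophe now waits for a second key:
            ;; s, d, ', a, i, &, <, - or .
            (local-set-key "'" 'ong-insert-special)))

;; Keep a way to type a plain ASCII apostrophe (hypothetical key).
(global-set-key (kbd "C-c '") 'one-quote)
```

The local binding is a deliberate choice: rebinding the apostrophe globally would also hijack it in Emacs Lisp buffers, where you need the real thing all the time.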