Post by Bence Fábián
I did a quick writeup on little Edit scripts
Many thanks, this thread is very useful.
There is also Jason Catena's list of Edit idioms at
https://raw.github.com/catenate/acme-fonts/master/test/1/acme/Edit/sam
When editing and re-editing latex, I regularly pipe selections
through a simple-minded script called `chunk' which does most of
the work for obtaining semantic linebreaks. That goes back to a
recommendation by Kernighan in his paper `Unix for beginners' of
1974; see the quotation, comments and link at [1].
#!/usr/local/plan9/bin/rc
# chunk up (to prepare) for semantic linebreaks
# do not break within \cite
# do not break within $$ math
# break after closing parentheses ),]
# break before an opening parenthesis (,[
ssam -e 'x/(^[^%].+\n)+/ y/\\cite[^{]*{(\n|.)*}/ y/\$.*\$/ x/(([^A-Z]\.)|[,;:!?]|\)|\]) | (\(|\[)/ s/ /\n/' \
	| 9 fmt -w 60 -j
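
For readers without plan9port at hand, here is a rough Python sketch of what the sam command in the script does. It is an approximation, not a port: the names chunk, BLOCK, PROTECTED and BREAK are made up for illustration, the \cite and $...$ patterns stop at the first closing delimiter rather than matching greedily as the sam patterns do, and the final fmt -w 60 -j wrapping step is left out.

```python
import re

# Runs of consecutive lines that do not start with '%' (LaTeX comments),
# mirroring x/(^[^%].+\n)+/ in the sam command.
BLOCK = re.compile(r'(?:^[^%\n].+\n)+', re.MULTILINE)

# Spans to leave untouched: \cite...{...} and inline $...$ math.
# Assumption: these stop at the first closing delimiter, unlike the
# greedy sam patterns.
PROTECTED = re.compile(r'\\cite[^{]*\{[^}]*\}|\$[^$\n]*\$')

# A space that follows a period (not preceded by a capital letter),
# other punctuation, or a closing bracket; or a space that precedes
# an opening bracket.
BREAK = re.compile(r'(?:(?<=[^A-Z\n]\.)|(?<=[,;:!?)\]])) | (?=[(\[])')


def _chunk_block(block):
    """Turn qualifying spaces into newlines, skipping protected spans."""
    out = []
    pos = 0
    for m in PROTECTED.finditer(block):
        out.append(BREAK.sub('\n', block[pos:m.start()]))
        out.append(m.group(0))  # copy \cite{...} or $...$ unchanged
        pos = m.end()
    out.append(BREAK.sub('\n', block[pos:]))
    return ''.join(out)


def chunk(text):
    """Chunk every run of non-comment lines; leave everything else alone."""
    return BLOCK.sub(lambda m: _chunk_block(m.group(0)), text)


if __name__ == '__main__':
    import sys
    sys.stdout.write(chunk(sys.stdin.read()))
```

Run as a filter on a selection, it breaks after commas, sentence-ending periods and closing brackets, and before opening brackets, while comment lines and the contents of \cite{...} and $...$ pass through unchanged.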
For batch processing probably something more sophisticated would
be needed to leave various environments unchunked. But I don't use
it that way, and just apply it to selections where I know its use
makes sense. Usually these are areas where I have just been doing
a lot of rewriting.
There's no point in chunking up commented material, and sometimes
it is actually convenient to have a place where I can keep things
unchunked for reference.
The original chunk command in Writer's Workbench [2], for troff not
latex, was based on a parser for English, I think. I find I don't
want that (because I write in other languages as well), and that
even in English I don't need it (because chunking based on
punctuation is always fine with me, and where I care about the
remaining cases, I prefer to do it myself; but see [3]).
Mark.
[1] http://rhodesmill.org/brandon/2012/one-sentence-per-line/
[2] http://man.cat-v.org/unix_WWB/1/chunk
[3] https://github.com/waldir/semantic-linebreaker