Discussion:
[9fans] awk split(a[1], a)
(too old to reply)
c***@gmx.de
2012-12-10 07:13:11 UTC
Permalink
theres a bug in awk's split function. split depopules
the array a, accidently freeing a[1] (y). this doesnt show
up with ape's memory allocator but with plan9's pool
allocator or by modifying freesymtab() to XXX out
the strings before freeing one gets surprising results
like:

awk 'BEGIN{a[1]="a b";print split(a[1],a),a[1],a[2]}'
1 XXX

diff -r c0da82ea3510 sys/src/cmd/awk/run.c
--- a/sys/src/cmd/awk/run.c Sat Dec 08 09:23:05 2012 +0100
+++ b/sys/src/cmd/awk/run.c Mon Dec 10 07:17:56 2012 +0100
@@ -1213,7 +1213,9 @@
FATAL("illegal type of split");
sep = *fs;
ap = execute(a[1]); /* array name */
+ y->tval |= DONTFREE; /* split(a[x], a); */
freesymtab(ap);
+ y->tval &= ~DONTFREE;
dprintf( ("split: s=|%s|, a=%s, sep=|%s|\n", s, ap->nval, fs) );
ap->tval &= ~STR;
ap->tval |= ARR;

--
cinap
c***@gmx.de
2012-12-10 08:20:58 UTC
Permalink
small addition. this introduces a bug. have to save the DONTFREE
flag if its already set! sorry!

--- a/sys/src/cmd/awk/run.c Mon Dec 10 07:20:00 2012 +0100
+++ b/sys/src/cmd/awk/run.c Mon Dec 10 09:19:04 2012 +0100
@@ -1213,9 +1213,10 @@
FATAL("illegal type of split");
sep = *fs;
ap = execute(a[1]); /* array name */
+ n = y->tval;
y->tval |= DONTFREE; /* split(a[x], a); */
freesymtab(ap);
- y->tval &= ~DONTFREE;
+ y->tval = n;
dprintf( ("split: s=|%s|, a=%s, sep=|%s|\n", s, ap->nval, fs) );
ap->tval &= ~STR;
ap->tval |= ARR;

--
cinap
erik quanstrom
2012-12-10 13:13:32 UTC
Permalink
Post by c***@gmx.de
theres a bug in awk's split function. split depopules
the array a, accidently freeing a[1] (y). this doesnt show
up with ape's memory allocator but with plan9's pool
allocator or by modifying freesymtab() to XXX out
the strings before freeing one gets surprising results
awk 'BEGIN{a[1]="a b";print split(a[1],a),a[1],a[2]}'
1 XXX
diff -r c0da82ea3510 sys/src/cmd/awk/run.c
--- a/sys/src/cmd/awk/run.c Sat Dec 08 09:23:05 2012 +0100
+++ b/sys/src/cmd/awk/run.c Mon Dec 10 07:17:56 2012 +0100
@@ -1213,7 +1213,9 @@
FATAL("illegal type of split");
sep = *fs;
ap = execute(a[1]); /* array name */
+ y->tval |= DONTFREE; /* split(a[x], a); */
freesymtab(ap);
+ y->tval &= ~DONTFREE;
dprintf( ("split: s=|%s|, a=%s, sep=|%s|\n", s, ap->nval, fs) );
ap->tval &= ~STR;
ap->tval |= ARR;
i believe bwk fixed this. i ported the newest version (as of a year ago)
back to plan 9. see contrib quanstro/awk.

- erik
David du Colombier
2012-12-12 21:33:57 UTC
Permalink
Post by erik quanstrom
i ported the newest version (as of a year ago)
back to plan 9. see contrib quanstro/awk.
In my side, I also updated AWK to the latest version
(20110810) from Brian Kernighan a year ago.

http://www.9legacy.org/9legacy/patch/awk.diff

I was unsure about some details, so I haven't
submitted the patch.

I noticed your port is based on a different version
(20070501), but it might be interesting to compare
our changes.
--
David du Colombier
erik quanstrom
2012-12-12 21:39:33 UTC
Permalink
Post by David du Colombier
Post by erik quanstrom
i ported the newest version (as of a year ago)
back to plan 9. see contrib quanstro/awk.
In my side, I also updated AWK to the latest version
(20110810) from Brian Kernighan a year ago.
http://www.9legacy.org/9legacy/patch/awk.diff
I was unsure about some details, so I haven't
submitted the patch.
I noticed your port is based on a different version
(20070501), but it might be interesting to compare
our changes.
i made one interesting change. printf "%x" could cause
a floating point exception in awk, since it isn't very careful
about these things. in linux, nobody cares about minor
little things like loss of precision, so nobody has noticed.

since this is an internal problem (due to internal representation,
and how %x works), i decided to intercept the problem and
do something reasonable, rather than abort.

i left the other odd bits from the old port, especially the re
transformations.

- erik
c***@gmx.de
2012-12-12 22:24:53 UTC
Permalink
it looks to me as if eriks contrib/awk and the changes from
9legacy *should* have the same bug that was reported. at least i
cannot find any obvious changes in the diffs that would fix it.

http://www.cs.princeton.edu/~bwk/btl.mirror/

also seems to have the bug.

a way to reproduce the problem was presented (just override
the data on free with some pattern and then run the awk snipped
from the original mail).

it would be good if someone else veryfy this.

--
cinap
erik quanstrom
2012-12-12 22:31:01 UTC
Permalink
Post by c***@gmx.de
it looks to me as if eriks contrib/awk and the changes from
9legacy *should* have the same bug that was reported. at least i
cannot find any obvious changes in the diffs that would fix it.
huh, maybe i misread it. obviously (see charles response), i can't
read. i'll take another look, but you can as well. the binary and
source are on sources.

- erik
c***@gmx.de
2012-12-12 22:38:47 UTC
Permalink
i know. thats where i checked the source :) the bug isnt really relevant
in practice (you cannot see it with the binary). its just a use
after free and ape's free() implementation doesnt override the the freed
data so you wont see any side effects unless you use a different free
implementation.

--
cinap
erik quanstrom
2012-12-13 03:22:58 UTC
Permalink
Post by c***@gmx.de
it looks to me as if eriks contrib/awk and the changes from
9legacy *should* have the same bug that was reported. at least i
cannot find any obvious changes in the diffs that would fix it.
http://www.cs.princeton.edu/~bwk/btl.mirror/
also seems to have the bug.
a way to reproduce the problem was presented (just override
the data on free with some pattern and then run the awk snipped
from the original mail).
it would be good if someone else veryfy this.
i agree, this is a bug. can you explain why the fix does not leak?

- erik
c***@gmx.de
2012-12-13 03:35:47 UTC
Permalink
it is free at the bottom of split(), we call tempfree(y);
there. the fix just *temporarily* marks it as DONTFREE during
the freesymtab() call (which might have the very "y" referenced
and would otherwise free it under us).

--
cinap
erik quanstrom
2012-12-13 03:37:52 UTC
Permalink
lgtm.

- erik

Continue reading on narkive:
Loading...