[9fans] local apic unreadable?
Richard Miller
2012-06-08 17:07:27 UTC
I'm trying to bring up Plan 9 on a new motherboard (intel DH61DL)
and getting a very odd problem. The immediate symptom is
panic: assert failed at 0xf01556f3: lapictimer.hz != 0
which is occurring because all reads to the local apic registers
are returning 0xFFFFFFFF.

I've verified that I can read the lapic from l.s before turning
the mmu on, getting sensible values from the registers.

By calling mmuwalk() I've verified that vmap() has put the right value
in the PTE to map the lapic area uncached:

vmap 0xfee00000 1024 => 0xe0000000
pte (pdb=f0012000) @f0019000 = fee00013
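
For readers checking the PTE value above, here is a minimal sketch of how fee00013 splits into a frame address and flag bits. It assumes standard 32-bit x86 4KB-page entries; the names (PteP, ptepa and so on) are made up for this illustration and are not the identifiers used in the Plan 9 kernel.

```c
#include <stdint.h>

/* Flag bits in a 32-bit x86 page-table entry (4KB pages).
 * Names are illustrative, not taken from the Plan 9 source. */
enum {
	PteP   = 1<<0,	/* present */
	PteRW  = 1<<1,	/* writable */
	PteUS  = 1<<2,	/* user accessible */
	PtePWT = 1<<3,	/* page-level write-through */
	PtePCD = 1<<4,	/* page-level cache disable */
};

/* Physical frame address: the top 20 bits of the entry. */
static uint32_t
ptepa(uint32_t pte)
{
	return pte & 0xFFFFF000u;
}

/* Flag bits: the low 12 bits of the entry. */
static uint32_t
pteflags(uint32_t pte)
{
	return pte & 0xFFFu;
}

/* Non-zero if the entry is present, writable and cache-disabled. */
static int
pteuncachedrw(uint32_t pte)
{
	uint32_t want = PteP|PteRW|PtePCD;

	return (pte & want) == want;
}
```

With pte = 0xfee00013, ptepa() gives 0xfee00000 and the flags are 0x13, that is present, writable and cache-disabled, which matches the mapping described above.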

I've verified that the mtrr settings have the lapic regs uncached
(provided that nesting of entries works correctly, since there are
4GB of RAM installed):

cache default uc
cache 0x0 4294967296 wb
cache 0x100000000 536870912 wb
cache 0xdb800000 8388608 uc
cache 0xdc000000 67108864 uc
cache 0xe0000000 536870912 uc
cache 0x11f600000 2097152 uc
cache 0x11f800000 8388608 uc
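
The nesting question can be checked by hand with a small sketch like the one below. It assumes the usual rule for overlapping variable-range MTRRs (any matching uc range wins, otherwise the matching type applies, otherwise the default). It models the entries as base/size pairs exactly as listed above rather than as the base/mask register pairs the hardware uses, and it ignores the fixed-range MTRRs below 1MB. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { UC, WB };	/* only the two types that appear in the listing */

typedef struct Range Range;
struct Range {
	uint64_t base;
	uint64_t size;
	int type;
};

/* The entries from the listing above, copied as base/size/type. */
static const Range ranges[] = {
	{ 0x0ULL,         4294967296ULL, WB },
	{ 0x100000000ULL, 536870912ULL,  WB },
	{ 0xdb800000ULL,  8388608ULL,    UC },
	{ 0xdc000000ULL,  67108864ULL,   UC },
	{ 0xe0000000ULL,  536870912ULL,  UC },
	{ 0x11f600000ULL, 2097152ULL,    UC },
	{ 0x11f800000ULL, 8388608ULL,    UC },
};

/* Effective type for physical address pa under the assumed overlap rule. */
static int
memtype(uint64_t pa)
{
	int type = UC;	/* "cache default uc" */
	size_t i;

	for(i = 0; i < sizeof ranges / sizeof ranges[0]; i++){
		if(pa < ranges[i].base || pa - ranges[i].base >= ranges[i].size)
			continue;
		if(ranges[i].type == UC)
			return UC;	/* uc takes precedence over any overlap */
		type = ranges[i].type;
	}
	return type;
}
```

Under these assumptions memtype(0xfee00000) is UC, because the address falls inside both the 4GB wb range and the 512MB uc range starting at 0xe0000000, and the uc entry takes precedence.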

I've thrown in a few wbinvd() and flushpg() calls out of general
paranoia. Nothing helps.

Can anyone suggest an avenue of investigation? Surely I'm not the
first to try running native Plan 9 with 4GB of RAM?
erik quanstrom
2012-06-08 17:33:49 UTC
Post by Richard Miller
I'm trying to bring up Plan 9 on a new motherboard (intel DH61DL)
and getting a very odd problem. The immediate symptom is
panic: assert failed at 0xf01556f3: lapictimer.hz != 0
which is occurring because all reads to the local apic registers
are returning 0xFFFFFFFF.
every one of our machines has 4gb of ram. so i don't
think that's the problem.

would you mind trying this kernel just once?
http://ftp.quanstro.net/other/9pccpuf.gz # or other kernel name

i made some changes that may help if the problem is in
lapic identification & initialization. if it does work, we
can figure out exactly what makes it work.

- erik
Richard Miller
2012-06-08 17:53:29 UTC
Post by erik quanstrom
http://ftp.quanstro.net/other/9pccpuf.gz # or other kernel name
That works, thanks!

LAPIC: fee00000 e0000000
rsd 0xf00f0450, physaddr 0xda998028 length 36 0xda998068 rev 2 oem INTEL
APIC lapic addr 0xfee00000, flags 0x00000001
apic proc 0/0 apicid 0 flags (mp)
apic proc 1/1 apicid 2 flags (mp)
apic proc 2/2 apicid 4 flags (mp)
apic proc 3/3 apicid 6 flags (mp)
ioapic 0 addr fec00000 base 0 (mp)
apicnos: 00/01 02/02 04/04 06/08
apic: 4 machs started; flat mode vectors
pcirouting: ignoring south bridge PCI.0.31.0 8086/1C5C
... etc

Let's continue offline and I'll report to 9fans when we've
pinpointed the relevant fix.
erik quanstrom
2012-06-08 18:01:42 UTC
Post by Richard Miller
Post by erik quanstrom
http://ftp.quanstro.net/other/9pccpuf.gz # or other kernel name
That works, thanks!
Let's continue offline and I'll report to 9fans when we've
pinpointed the relevant fix.
ah, you're welcome. it's good to have things working!

the source for that kernel is @ http://ftp.quanstro.net/other/kernel.mkfs.bz2

- erik
Richard Miller
2012-06-11 11:00:51 UTC
OK, solved. This motherboard (or bios) gives the io apic an id of 0,
same as the boot processor's local apic. The unexpected aliasing
causes the lapic (virtual) address to be overwritten by the ioapic
address, so the lapic timer code is looking in the wrong place for its
registers.

This was actually corrected on sources by a change to mp.c on 17 May
(by making two separate tables for local apics and io apics).
Unfortunately the /386/9pc on sources, although also dated 17 May,
doesn't include the fix, so I didn't realise it had been fixed till I
tried rebuilding the kernel from a fresh replica/pull.
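
The id collision and the two-table fix described above can be sketched roughly as follows. This is not the code from mp.c; the type, table and function names are invented for illustration, and the table size is an arbitrary assumption. The lapic value 0xe0000000 is the virtual address reported by vmap earlier in the thread. The ioapic's virtual address is not given anywhere in the thread, so 0xfec00000 (its physical address from the boot log) merely stands in for it.

```c
#include <stdint.h>

enum { Maxapic = 16 };	/* hypothetical table size */

typedef struct Apic Apic;
struct Apic {
	int used;
	uint32_t addr;	/* where the code will look for the registers */
};

/* One table for both kinds, indexed by id: an entry registered
 * later under the same id replaces the earlier one. */
static Apic shared[Maxapic];

/* One table per kind: local apic ids and io apic ids no longer collide. */
static Apic lapics[Maxapic];
static Apic ioapics[Maxapic];

/* Record addr under id in tab; returns -1 if id is out of range. */
static int
setapic(Apic *tab, int id, uint32_t addr)
{
	if(id < 0 || id >= Maxapic)
		return -1;
	tab[id].used = 1;
	tab[id].addr = addr;
	return 0;
}
```

Registering the boot processor's lapic and then the ioapic, both with id 0, in shared leaves shared[0].addr pointing at the ioapic, which is the symptom seen here. Doing the same with lapics and ioapics keeps both addresses intact.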

Erik's kernel has also had a similar fix, presumably made some time
ago - there's a 9fans post about a "wierd lapic/ioapic configuration"
which I should have spotted as a clue ...
erik quanstrom
2012-06-11 13:03:27 UTC
Post by Richard Miller
OK, solved. This motherboard (or bios) gives the io apic an id of 0,
same as the boot processor's local apic. The unexpected aliasing
causes the lapic (virtual) address to be overwritten by the ioapic
address, so the lapic timer code is looking in the wrong place for its
registers.
though unexpected, it's not aliasing. lapics and ioapics have different
apic id namespaces.
Post by Richard Miller
Erik's kernel has also had a similar fix, presumably made some time
ago - there's a 9fans post about a "wierd lapic/ioapic configuration"
which I should have spotted as a clue ...
1. may 2011, according to the dump.

- erik
