Author Topic: Punix  (Read 43212 times)

Offline christop

  • LV3 Member (Next: 100)
  • ***
  • Posts: 87
  • Rating: +20/-0
    • View Profile
Re: Punix
« Reply #30 on: March 04, 2011, 01:56:52 pm »
AHA!

I moved the SSP to 0x3000 and it still goes to 0x0000002f. This means it's not caused by a write to the screen. BUT... here's something I didn't notice before: upon entry to the address exception handler, the SSP is 0x2ef2. Add 6 bytes to get 0x2ef8, which is what it was immediately before the address exception. This is 264 bytes (0x108) below what it should be (0x3000).

Indeed, these are the contents from 0x2ffa through 0x2fff:
2ffa: 0000
2ffc: 0041
2ffe: 6f1c

That is the saved SR and PC from a trap. 0x00416f1c is the address following a trap #0 in userspace (with a move #3,%d0 before the trap, which means this is a read() system call, just as I suspected).

What this means is that some time before the address exception (probably right before), some exception or trap handler returned without popping all its registers off the stack (264 bytes' worth). I just need to narrow down which handler is responsible and then take it out back and shoot it. After it's dead I'll replace it with a better version. :)
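
For anyone following the arithmetic, here is a minimal C sketch of the two calculations above: how far the SSP is from where it should be, and how the six bytes at 0x2ffa decode into a saved SR and PC. The helper names (stack_deficit, decode_trap_frame) are made up for illustration and are not part of Punix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes are still on the supervisor stack
 * that should already have been popped. */
static uint32_t stack_deficit(uint32_t expected_ssp, uint32_t actual_ssp)
{
    assert(actual_ssp <= expected_ssp); /* the 68000 stack grows downward */
    return expected_ssp - actual_ssp;
}

/* The short 68000 trap frame: a 16-bit SR followed by a 32-bit PC. */
struct trap_frame {
    uint16_t sr;
    uint32_t pc;
};

/* Build the frame from the three 16-bit words as they appear in memory
 * (the 68000 is big-endian, so the high PC word comes first). */
static struct trap_frame decode_trap_frame(const uint16_t words[3])
{
    struct trap_frame f;
    f.sr = words[0];
    f.pc = ((uint32_t)words[1] << 16) | words[2];
    return f;
}
```

Feeding it the numbers from this post, stack_deficit(0x3000, 0x2ef8) gives 264, and the words 0000 0041 6f1c decode to SR 0x0000 and PC 0x00416f1c.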
Christopher Williams

Offline christop

Re: Punix
« Reply #31 on: March 04, 2011, 03:13:07 pm »
I replaced ALL rte instructions with "jbra check_rte", and in check_rte I make sure the stack pointer is correct (0x2ffa) if the saved SR has the supervisor bit clear (which means it is returning back to user space). I also test the return address against the value 0x0000002f. If either of those tests fails, it halts right there. Guess what? Nothing failed the test, but the address exception still happens. :'(
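
The check described above can be modelled in plain C roughly as follows. This is only a sketch of the logic, not the actual check_rte assembly: the function name rte_frame_ok and the constants are assumptions based on the values quoted in this thread (SSP at 0x3000, a 6-byte trap frame, bad address 0x0000002f).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SR_SUPERVISOR    0x2000u      /* S bit (bit 13) of the 68000 status register */
#define KERNEL_STACK_TOP 0x3000u      /* initial SSP, per the posts above */
#define TRAP_FRAME_SIZE  6u           /* SR word + PC long */
#define BAD_ADDRESS      0x0000002fu  /* the address seen in the exception */

/* Returns true if it looks safe to execute rte with this frame on top of
 * the supervisor stack. */
static bool rte_frame_ok(uint32_t ssp, uint16_t saved_sr, uint32_t saved_pc)
{
    assert((ssp & 1u) == 0); /* the stack pointer itself must be word-aligned */

    if (saved_pc == BAD_ADDRESS)
        return false;

    /* Returning to user mode: the trap frame must be the only thing
     * left on the kernel stack. */
    if ((saved_sr & SR_SUPERVISOR) == 0 &&
        ssp != KERNEL_STACK_TOP - TRAP_FRAME_SIZE)
        return false;

    return true;
}
```

A return to supervisor mode (nested exception) is allowed at any stack depth, which is why the stack test only applies when the S bit is clear.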
Christopher Williams

Offline christop

Re: Punix
« Reply #32 on: March 05, 2011, 02:26:01 am »
Alright, after many hours of searching for the root cause, I still haven't a real clue. But I finally have a real starting place.

Inside the system call (trap #0) handler I set the interrupt level to 1 to block the system clock (level 1) interrupt from running during a system call. This seems to stop the address error from happening. This isn't an actual fix, though, since I still want the system clock to run during system calls (otherwise the system slows down due to missed interrupts). This only points the finger at the system clock interrupt handler, or some bad interaction between it and a system call.
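
To illustrate why raising the interrupt level to 1 masks the clock, here is a small C model of the 68000 interrupt priority mask (bits 10..8 of the SR). The function names are hypothetical; the real kernel would change the SR with a move to %sr in assembly rather than with C code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SR_IPL_SHIFT 8
#define SR_IPL_MASK  0x0700u   /* interrupt priority mask, SR bits 10..8 */

/* Return a copy of sr with the interrupt mask set to `level` (0..7). */
static uint16_t sr_with_ipl(uint16_t sr, unsigned level)
{
    assert(level <= 7);
    return (uint16_t)((sr & ~SR_IPL_MASK) | (level << SR_IPL_SHIFT));
}

/* An interrupt is taken only if its level is above the mask;
 * level 7 is non-maskable. */
static bool interrupt_accepted(uint16_t sr, unsigned level)
{
    unsigned mask = (sr & SR_IPL_MASK) >> SR_IPL_SHIFT;
    return level == 7 || level > mask;
}
```

With the mask at 1, a level 1 interrupt is held off while a level 3 interrupt still gets through, so only the system clock is affected by this experiment.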

During this time I also uncovered a handful of other minor synchronization issues and bugs, and I greatly simplified the system clock (it was a kind of kludgey mix of the int 1 and int 3 clocks), so this bug hunt wasn't a complete waste of time. None of these changes warrant a new beta, so I'll just keep looking for this pesky bug.
Christopher Williams

Offline christop

Re: Punix
« Reply #33 on: March 05, 2011, 12:58:25 pm »
Okay, so I found the problem. It is in this line in set_state() in sched.c:
Code: [Select]
               if (time_before(p->p_deadline, current->p_deadline)) {
Somehow the value 1 is passed in for p, and the p_deadline member is at offset 0x2e in the proc structure, so this is attempting to access a long value at p->p_deadline at address 0x2f. :)
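
Spelled out in C, the faulting address is just the bogus pointer value plus the member offset. This is a sketch with made-up helper names; only the offset 0x2e and the pointer value 1 come from the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define P_DEADLINE_OFFSET 0x2eu /* offset of p_deadline in struct proc, per the post */

/* The offset is even, so any properly aligned proc pointer gives an
 * aligned access; only an odd pointer can trigger the fault. */
static_assert((P_DEADLINE_OFFSET & 1u) == 0, "p_deadline offset must be even");

/* Address the CPU actually touches for base->member. */
static uint32_t member_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* On the 68000, a word or long access to an odd address raises an
 * address error. */
static bool causes_address_error(uint32_t addr)
{
    return (addr & 1u) != 0;
}
```

With p == 1, member_address(1, P_DEADLINE_OFFSET) is 0x2f, which is odd, so the long read of p->p_deadline faults.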

So the value 0x0000002f in the address exception's stack frame isn't the return address but rather the non-word-aligned address that some code attempted to access.

Now I've finally bothered to read M68000PRM.pdf. Here's the stack frame for address error exceptions (in pretty ASCII art :)):
Code: [Select]
 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
+-----------------------------------------------+
|                                |RW|IN|  func  | <-- lower address (%sp)
+-----------------------------------------------+
|             access address (high)             |
+                                               +
|             access address (low)              |
+-----------------------------------------------+
|             instruction register              |
+-----------------------------------------------+
|               status register                 |
+-----------------------------------------------+
|             program counter (high)            |
+                                               +
|             program counter (low)             |
+-----------------------------------------------+
RW (Read/Write): Write = 0, Read = 1
IN (Instruction/Not): Instruction = 0, Not = 1

With that layout in hand, this address exception and its stack frame make a lot more sense.
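
The frame layout above could be decoded with something like the following C sketch. The struct and function names are invented for illustration. The input is the seven 16-bit words of the frame in memory order, starting at %sp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the 14-byte (7-word) address error frame shown above. */
struct address_error_frame {
    bool     read;                 /* R/W, bit 4: 1 = read, 0 = write */
    bool     not_instruction;      /* I/N, bit 3: 1 = not, 0 = instruction */
    unsigned func_code;            /* function code, bits 2..0 */
    uint32_t access_address;       /* the address that faulted */
    uint16_t instruction_register;
    uint16_t status_register;
    uint32_t program_counter;
};

/* w[0] is the word at %sp (lowest address); the 68000 is big-endian, so
 * the high word of each long comes first. */
static struct address_error_frame decode_address_error(const uint16_t w[7])
{
    struct address_error_frame f;

    assert(w != NULL);
    f.read                 = (w[0] & 0x0010u) != 0;
    f.not_instruction      = (w[0] & 0x0008u) != 0;
    f.func_code            = w[0] & 0x0007u;
    f.access_address       = ((uint32_t)w[1] << 16) | w[2];
    f.instruction_register = w[3];
    f.status_register      = w[4];
    f.program_counter      = ((uint32_t)w[5] << 16) | w[6];
    return f;
}
```

The important field for this bug is access_address: that is where the 0x0000002f shows up, while the real PC sits five words further up the stack.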

I still have to figure out what is calling set_state() with the value 1.

EDIT:  :crazy: I just found it. I called sched_run() (which calls set_state()) without any arguments, so whatever value happened to be on the stack was used as the value for p. FAIL  :banghead:

EDIT 2: Lo and behold, that fixed the problem!! I guess I should use function prototypes more thoroughly, since TIGCC would give an error if I had a prototype for sched_run().
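
Here is a minimal sketch of what the prototype buys you. The struct proc contents, the p_status field, and the state constants are stand-ins invented for this example; only the name sched_run() and its single proc pointer argument come from the posts above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's proc structure (hypothetical field). */
struct proc {
    int p_status;
};

enum { P_SLEEPING = 0, P_RUNNING = 1 }; /* hypothetical state values */

/* Full prototype: the compiler now knows sched_run() takes exactly one
 * argument of type struct proc *. In a real project this would live in
 * a header included by every caller. */
void sched_run(struct proc *p);

void sched_run(struct proc *p)
{
    assert(p != NULL);
    p->p_status = P_RUNNING;
}

/*
 * sched_run();
 *
 * With the prototype in scope, the call above is rejected at compile time
 * (too few arguments). With no declaration at all, an older C compiler
 * accepts it and p ends up being whatever happens to be on the stack.
 */
```

Note that an old-style declaration with empty parentheses, such as void sched_run();, would not help, since it says nothing about the arguments.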

EDIT 3: Actually, calling sched_run() from that place (in tsleep() in process.c) was a bug itself. The process was already scheduled to run at that time so it was trying to schedule a process that was already running. I just had to remove the line completely. Now my scheduler and (due to my recent clock/timer fixes) timers are extremely stable!  :w00t:
« Last Edit: March 05, 2011, 03:12:37 pm by christop »
Christopher Williams

Offline Jim Bauwens

  • Lua! Nspire! Linux!
  • Editor
  • LV10 31337 u53r (Next: 2000)
  • **********
  • Posts: 1881
  • Rating: +206/-7
  • Linux!
    • View Profile
    • nothing...
Re: Punix
« Reply #34 on: March 05, 2011, 04:07:04 pm »
Congratulations!
Good coding!  :D

Offline Lionel Debroux

  • LV11 Super Veteran (Next: 3000)
  • ***********
  • Posts: 2135
  • Rating: +290/-45
    • View Profile
    • TI-Chess Team
Re: Punix
« Reply #35 on: March 07, 2011, 12:27:36 am »
Good ;)
So my wild guess about a read at offset 0x2E from address 1 actually proved to be correct...

The bug you've reported on the TIEmu SF tracker looks odd. TIEmu has no business reading data one byte off its valid location...
Member of the TI-Chess Team.
Co-maintainer of GCC4TI (GCC4TI online documentation), TILP and TIEmu.
Co-admin of TI-Planet.

Offline christop

Re: Punix
« Reply #36 on: March 07, 2011, 05:43:37 am »
Quote from: Lionel Debroux
Good ;)
So my wild guess about a read at offset 0x2E from address 1 actually proved to be correct...

Yup. I think I'll improve my exception handlers one of these days so they provide more information about the exception in a human-readable format. That would only apply when it happens in kernel mode; if it happens in user mode, I'll want to send a signal to the process and/or kill it.

Quote from: Lionel Debroux
The bug you've reported on the TIEmu SF tracker looks odd. TIEmu has no business reading data one byte off its valid location...

I noticed that both the working and the non-working (crashing) TIB files have the long value 0xCCCCCCCC immediately before the SSP and PC, but in the crashing TIB there is an additional 0xCC byte before that long value. That is probably just the last byte of a length or pointer value. Does TiEmu look for 0xCCCCCCCC as a marker or signature of sorts? If so, then it would start at the previous byte for that reason.
Christopher Williams

Offline Lionel Debroux

Re: Punix
« Reply #37 on: March 07, 2011, 06:37:09 am »
Yes, TIEmu looks for 0xCC 0xCC 0xCC 0xCC. Not when importing a TIB (src/core/images.c), but in src/core/ti_hw/flash.c::find_ssp_and_pc. And if the OS's size happens to end with 0xCC (which you describe), bad things occur...

=> for the time being, just add two bytes of waste to your OS, and you should be fine. I'll fix TIEmu :)
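
To make the failure mode concrete, here is a deliberately naive C sketch of a "first run of four 0xCC bytes" search. This is NOT the code from TIEmu's find_ssp_and_pc. It is an assumption about the general technique, written only to show how one extra 0xCC byte in front of the marker shifts everything read after it by one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the first run of four 0xCC bytes, or -1 if none. */
static long find_marker(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (buf[i] == 0xCC && buf[i + 1] == 0xCC &&
            buf[i + 2] == 0xCC && buf[i + 3] == 0xCC)
            return (long)i;
    }
    return -1;
}

/* Read a big-endian 32-bit value, as the 68000 stores it. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

If the byte just before the marker is also 0xCC, the first match starts one byte early, so the "SSP" read right after the match begins with the marker's last 0xCC byte instead of the real value.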
Member of the TI-Chess Team.
Co-maintainer of GCC4TI (GCC4TI online documentation), TILP and TIEmu.
Co-admin of TI-Planet.

Offline DJ Omnimaga

  • Clacualters are teh gr33t
  • CoT Emeritus
  • LV15 Omnimagician (Next: --)
  • *
  • Posts: 55943
  • Rating: +3154/-232
  • CodeWalrus founder & retired Omnimaga founder
    • View Profile
    • Dream of Omnimaga Music
Re: Punix
« Reply #38 on: March 07, 2011, 10:06:10 pm »
I think this is the first time I've seen a quadruple post here, although it's fine given how much time passed between them. That goes to show how much we need a few more 68K users. D:

I'm glad this is still progressing well. :D

Offline christop

Re: Punix
« Reply #39 on: March 07, 2011, 11:26:04 pm »
Quote from: DJ Omnimaga
I think this is the first time I've seen a quadruple post here, although it's fine given how much time passed between them. That goes to show how much we need a few more 68K users. D:

I'm glad this is still progressing well. :D

Heh, it's actually five. :) I didn't realize I had posted so many in a row when I did it. You're right, we do need more 68k folks around here.

Anyway, I just added a busy indicator to the bottom status line (next to the battery status). I think it's pretty cool watching it flicker as programs run and pause:


What does everyone think about the status icon itself? I don't particularly care for it, but it was the best I could come up with. Can anyone draw a better one? (Should I ask this in the Pixel Art and Drawing board instead?)
Christopher Williams

Offline DJ Omnimaga

Re: Punix
« Reply #40 on: March 07, 2011, 11:29:16 pm »
Lol I missed the other post. XD

As for the busy indicator it seems fine, although it would probably be better with an hourglass or something, which, unfortunately, might not fit. If the icon kinda blinks it might be fine, though, since it would be like that computer light that blinks when stuff is loading.

If you need a better icon, you could ask in the Pixel Art section, although sometimes it takes weeks to get a reply.

Offline christop

Re: Punix
« Reply #41 on: March 08, 2011, 02:04:14 am »
Ok, I changed it to a tiny hourglass. I already tried to draw an hourglass before I posted, but I didn't like the way it turned out. Speaking of the status icons, I don't really like how the hand icon looks either. I think it looks like a claw. :) I don't really have a use for the "claw" button anyway. Maybe I could use it to switch to different virtual terminals using claw+F1 through F8.
Christopher Williams

Offline DJ Omnimaga

Re: Punix
« Reply #42 on: March 08, 2011, 03:54:50 am »
Ah ok, for me the icon next to the battery seemed more like a star. :P

Offline christop

Re: Punix
« Reply #43 on: March 09, 2011, 03:43:05 pm »
I just released Beta 3, now with vfork()! This means multi-tasking works now.

The current shell can't run any pipelines or background processes, so multi-tasking is only marginally useful at this stage, but it's an important piece of the puzzle for later on.
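
For anyone who hasn't used vfork() before, this is roughly how a parent spawns and waits for a child on a regular POSIX system. It is a generic sketch, not code from Punix, and it assumes the usual POSIX calls (vfork, _exit, waitpid) behave the same way there. The function name run_child is made up.

```c
#define _DEFAULT_SOURCE /* ask glibc to declare vfork() under strict -std modes */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn a child with vfork() that exits immediately with `code`.
 * Returns the child's exit status, or -1 on failure. */
static int run_child(int code)
{
    assert(code >= 0 && code <= 255);

    pid_t pid = vfork();
    if (pid < 0)
        return -1;          /* could not create the child */

    if (pid == 0) {
        /* Child: it borrows the parent's memory until it calls _exit()
         * or one of the exec functions, so it must do nothing else. */
        _exit(code);
    }

    /* Parent: resumes only after the child has exited (or exec'd). */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell would normally call one of the exec functions in the child instead of _exit(), which is the piece that pipelines and background jobs build on.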

Download the TIB file now, or read the release notes.
Christopher Williams

Offline DJ Omnimaga

Re: Punix
« Reply #44 on: March 10, 2011, 03:52:28 am »
Wow, multitasking. I'm glad to see this being implemented on calc. :D