Linux Freeze

Christian.Daschill at skf.com
Wed Nov 6 11:49:19 CET 2002


>No, Paolo, in my view the problem is different: when such a report appears
>on the list (if it is related to StrongARM) I suggest possible fixes and/or
>request more information, and then the communication stops.

I might be missing something here, but I don't remember receiving any more 
questions after I posted this:

>>>> BEGIN QUOTE
>After playing around we ran into a situation where RTAI is working well
>but Linux has frozen or sometimes crashed.
>This happens when an rt_timer is set or a load of GPIO IRQs is driven
>with a square wave generator (up to 10 kHz).
>It is absolutely not deterministic. When Linux user-space load is high, it
>happens much more often.

Unfortunately I am seeing the same symptoms:
When triggering a GPIO interrupt at 25.6 or 50 kHz, reading in some bytes
and stuffing them into a fifo in the ISR, and just reading from the fifo and
printf-ing the average every second in the user space routine, the user
space routine either freezes after a random amount of time or fails with a
message similar to this:

eth0: Low memory, packet dropped.
Unable to handle kernel NULL pointer dereference at virtual address 00000018
pgd = c0004000
*pgd = 00000000, *pmd = 00000000
Internal error: Oops: ffffffff
CPU: 0
pc : [<c0095360>]    lr : [<c2800188>]    Not tainted
sp : c0113ee4  ip : 0000003f  fp : c0113f0c
r10: c013f784  r9 : 00000004  r8 : 00000000
r7 : f6000300  r6 : 0000003e  r5 : c0267000  r4 : 00000000
r3 : 00000001  r2 : 00000001  r1 : c2802d8c  r0 : 00000025
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  Segment kernel
Control: C1F7317F  Table: C1F7317F  DAC: 0000001D
Process swapper (pid: 0, stackpage=c0113000)
Stack: (0xc0113ed4 to 0xc0114000)
3ec0:                                              c2800188 c0095360 20000013
3ee0: ffffffff c032a380 00000005 f6000300 c0267000 000000b3 00000004 c013f784
3f00: c0113f40 c0113f10 c0093d7c c00952bc 00004068 00003302 c032a380 00000005
3f20: c0114c50 00000000 c011d71c c0129500 c28036fc c0113f6c c0113f44 c0015ec0
3f40: c0093cfc 00000005 c2802b74 c2802b68 c2802d8c c2802d7c c2803748 c2802b68
3f60: c0113fac c0113f70 c2800218 c0015e0c c011dca8 6901b118 60000013 c0114c50
3f80: c0112000 c0016b5c c011dca8 6901b118 0000001f c0016bac c0016b5c c0112000
3fa0: c0113fd0 c0113fb0 c00164b0 c0016b68 00002000 c011d220 c011dcc8 c011dcc4
3fc0: c0114c48 c0113fe0 c0113fd4 c0013030 c0016464 c0113ffc c0113fe4 c00088cc
3fe0: c001300c c011e0f4 c01420f0 c01420f0 00000000 c0114000 c0008080 c0008798
Backtrace:
Function entered at [<c00952b0>] from [<c0093d7c>]
Function entered at [<c0093cf0>] from [<c0015ec0>]
Function entered at [<c0015e00>] from [<c2800218>]
Function entered at [<c0016b5c>] from [<c00164b0>]
 r5 = C0112000  r4 = C0016B5C
Function entered at [<c0016458>] from [<c0013030>]
 r8 = C0114C48  r7 = C011DCC4  r6 = C011DCC8  r5 = C011D220
 r4 = 00002000
Function entered at [<c0013000>] from [<c00088cc>]
Function entered at [<c000878c>] from [<c0008080>]
Code: e5953018 e2833001 e5853018 e286c001 (e584a018)
Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
 

That's what's left of my ISR:

#define ADCBYTE0  0xf7000000
#define ADCBYTE1  0xf7000001
#define ADCBYTE2  0xf7000002

#define ADCBYTE4  0xf7000004
#define ADCBYTE5  0xf7000005
#define ADCBYTE6  0xf7000006


union eightchars {
        unsigned char uc[8];
        short ss[4];
        long ll[2];
} lwork1;

static void intr_handler(void)
{
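        /* read in SD0 - SD7; volatile so the compiler always performs the I/O reads */
        lwork1.ss[0]=*(volatile short *)(ADCBYTE0);         // read in 2 bytes (offset 0)
        lwork1.uc[2]=*(volatile unsigned char *)(ADCBYTE2); // read in 1 byte  (offset 2)
        lwork1.ss[2]=*(volatile short *)(ADCBYTE4);         // read in 2 bytes (offset 4 is ss[2]; ss[4] is out of bounds)
        lwork1.uc[6]=*(volatile unsigned char *)(ADCBYTE6); // read in 1 byte  (offset 6)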

        /* put data into fifo */
        rtf_put(FIFO, &lwork1, sizeof(lwork1));

        rt_enable_irq(MY_IRQ);
}
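
As a side check on the indexing in the handler above, here is a minimal sketch in plain user-space C (not RTAI code) that maps a byte offset in the 8-byte sample to the matching index of ss[]. The helper names ss_index_for_byte and check_layout_assumptions are made up for illustration, and the sketch assumes a 2-byte short. Byte offset 4 corresponds to ss[2]; an index of 4 would sit at byte offset 8, past the end of ss[].

```c
#include <assert.h>
#include <stddef.h>

/* Same member layout as the union used in the ISR above. */
union eightchars {
        unsigned char uc[8];
        short ss[4];
        long ll[2];
};

/* Number of short elements in the union (4 here). */
#define SS_COUNT (sizeof(((union eightchars *)0)->ss) / sizeof(short))

/* Sanity check of the assumptions this sketch relies on. */
static void check_layout_assumptions(void)
{
        assert(sizeof(short) == 2);
        assert(SS_COUNT == 4);
}

/*
 * Hypothetical helper: map a byte offset within the sample to the
 * index of the short that starts at that offset.
 * Returns -1 if the offset is odd or lies outside ss[].
 */
static int ss_index_for_byte(size_t byte_offset)
{
        if (byte_offset % sizeof(short) != 0)
                return -1;
        if (byte_offset / sizeof(short) >= SS_COUNT)
                return -1;
        return (int)(byte_offset / sizeof(short));
}
```

An out-of-range index in the handler would write outside the union, so the index is worth checking before looking for other causes.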

 
>We're using Linux Kernel 2.4.18-rmk7 in conjunction with rtai 24.1.10.
I use Sysgo ELINOS Arm V2.1 (Linux 2.4.18 with rtai 24.1.8)
>Target is using an Intel SA-1110 (StrongARM) Processor.
Same here, board is an SSV DNP1110
>Gcc Version is 2.95.3.
Same.

It seems to me that, when increasing user space load by printf-ing more
often or pinging the system, the problem arises more quickly.

I was told that this is probably because I am using an "old" RTAI, but if
you guys have the same problem with the latest release, I am getting
nervous.

Christian Daschill

>>>> END QUOTE

OK, I did find some questions addressed to somebody else, which might 
apply to my problem as well:

>Ok, questions: 1) when Linux freezes, does RTAI (rt-tasks) continue to
>work? Can you see it, maybe on some GPIO-output, or something like that.
Yes, I can see that RTAI keeps working, because it toggles a GPIO output.

>You only have one CPU, and it has to be shared between RTAI (higher
>priority) and Linux (lower priority), so, if RTAI occupies the system
>completely, Linux has no chance. If you have only 1 rt-task and 1
>rt-interrupt handler, you can try to measure the time the system is
>occupied by rt-tasks, and the time RTAI is idle and Linux is running.
Time occupied by rt-tasks is roughly 20 to 25% (4 to 5 microseconds every 
20 microseconds)
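
The load figure above is simple arithmetic. A small sketch in plain C, with made-up helper names (irq_period_us, rt_load_percent), assuming the 4 to 5 microseconds are spent once per interrupt at the 50 kHz rate:

```c
/* Period in microseconds for an interrupt rate given in Hz. */
static double irq_period_us(double rate_hz)
{
        return 1e6 / rate_hz;
}

/* Share of each period spent in real-time code, as a percentage. */
static double rt_load_percent(double busy_us, double period_us)
{
        return 100.0 * busy_us / period_us;
}
```

At 50 kHz the period is 20 microseconds, so 4 and 5 microseconds of handler time give 20% and 25%. At 25.6 kHz the period is 39.0625 microseconds, and the same handler time would be about 10 to 13%.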

>Which modules are loaded? RTAI? your own rt-modules?
rt_process, rtai_fifos, rtai_sched (probably not necessary), rt_mem_mgr, rtai

>Try first of all to
>run some standard RTAI examples, like latency-calibration, and see what
>sort of frequency you can achieve. And how the latency behaves when you
>increase the (linux)-load.
>If standard tests work, then check your modules.
I have not done that yet, simply because these samples were not part of the
Elinos-package, and I had overlooked that part of your message.

Christian

--------------------------------------------
Ing. Christian Daschill
Systems Development
SKF QTC Austria

Christian.Daschill at skf.com
www.qtc.skf.com
+43 72 52 797 573     Voice
+43 664 361 2627      Mobile 
+43 72 52 797 88573 Fax
