[Chilli] 100% cpu problem

Marco Simioni m.simioni at gmail.com
Thu Oct 7 08:54:20 UTC 2010


Thank you, Alberto, for your answer.

I debugged it again with gdb, and the process was inside
"radius_getnextattr" again.

I'm now looking into the "radius_getnextattr()" function.

I see there is a while loop that could potentially become an endless
loop: "while (offset < len)".
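
To illustrate the concern, here is a minimal sketch of that kind of loop,
not the actual CoovaChilli source. It assumes the function walks the RADIUS
attributes by adding each attribute's own length byte to offset; find_attr
and its parameters are hypothetical names. RFC 2865 defines the length as
covering the type and length bytes too, so a valid value is never below 2.
If a reply ever carried a length of 0, a loop without the check below would
add 0 to offset forever and never reach len.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch only (hypothetical name and signature, not the
 * CoovaChilli code). Walks a buffer of RADIUS attributes, each laid out as
 * [type:1][length:1][value:length-2], and returns a pointer to the
 * instance-th (0-based) attribute of the requested type, or NULL if it is
 * absent or the buffer is malformed.
 */
const uint8_t *find_attr(const uint8_t *buf, size_t len,
                         uint8_t type, int instance)
{
    size_t offset = 0;
    int count = 0;

    while (offset < len) {
        /* Need at least the two header bytes. */
        if (len - offset < 2)
            return NULL;

        uint8_t attr_type = buf[offset];
        uint8_t attr_len = buf[offset + 1];

        /*
         * Guard: a length below 2 (in particular 0) would leave offset
         * unchanged or barely moving, and a length past the end of the
         * buffer would read out of bounds. Treat both as malformed.
         */
        if (attr_len < 2 || attr_len > len - offset)
            return NULL;

        if (attr_type == type) {
            if (count == instance)
                return buf + offset;
            count++;
        }

        offset += attr_len;   /* always advances by at least 2 here */
    }
    return NULL;
}
```

With a check like this, a malformed attribute makes the lookup fail instead
of spinning. Whether the real function is missing such a check is only an
assumption at this point.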

I inserted some debug messages; let's see what they show the next time it
happens.

Then I'll try your suggestions.

Thank you.

Regards,

Marco

2010/10/6 Alberto Bellettato <albesvs at yahoo.it>:
> Have you tried temporarily removing the "/etc/rc2.d/S20chilli radconfig"
> script from your cron table? It could help isolate the problem.
> Also, have you enabled ssl or redir in your chilli config?
>
> ----- Original Message ----- From: "Marco Simioni" <m.simioni at gmail.com>
> To: <chilli at coova.org>
> Sent: Wednesday, October 06, 2010 11:26 AM
> Subject: Re: [Chilli] 100% cpu problem
>
>
> No news.
>
> It still happened.
>
> I'm using the latest SVN version.
>
> How can I investigate the problem?
>
> 2010/9/23 Marco Simioni <m.simioni at gmail.com>:
>>
>> It happened again: I had 100% CPU after days of uptime.
>>
>> This time, I was able to identify the following:
>>
>> 1) Thanks to the "sar" utility, I was able to narrow down the start of
>> the high CPU usage to between 09:45 and 10:05.
>>
>> 09:05:02  CPU  %user  %nice  %system  %iowait  %steal  %idle
>> 09:45:02  all   0,00   0,00     0,04     4,26    0,00  95,70
>> 09:55:02  all   2,51   0,00     0,08     3,47    0,00  93,94
>> 10:05:02  all  98,32   0,00     1,68     0,00    0,00   0,00
>>
>> 2) In syslog, the messages I see are:
>>
>> Sep 23 09:51:10 izc coova-chilli[939]: chilli.c: 3402: DHCP addr released by MAC=90-84-0D-D2-00-2D IP=0.0.0.0
>> Sep 23 09:51:11 izc coova-chilli[939]: chilli.c: 3248: New DHCP request from MAC=90-84-0D-D2-00-2D
>> Sep 23 09:52:02 izc coova-chilli[939]: chilli.c: 3248: New DHCP request from MAC=00-0E-6A-7A-AB-9C
>> Sep 23 09:52:02 izc coova-chilli[939]: chilli.c: 3209: Client MAC=00-0E-6A-7A-AB-9C assigned IP 10.1.0.73
>> Sep 23 09:55:02 izc CRON[9452]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
>> Sep 23 10:00:03 izc CRON[9455]: (root) CMD (/etc/rc2.d/S20chilli radconfig)
>> Sep 23 10:05:01 izc CRON[9461]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
>>
>> 3) I tried to attach to the chilli process with "gdb". Nothing
>> happened. I didn't know what to do, so I tried a single step and got
>> the following:
>>
>> (gdb) step
>> Single stepping until exit from function radius_getnextattr, which has
>> no line number information.
>>
>> Then nothing else.
>>
>> 4) I tried to attach with "strace":
>>
>> root@izc:# strace -p 939
>> Process 939 attached - interrupt to quit
>>
>> and nothing happened.
>>
>> I then had to reboot so that customers could surf again.
>>
>> What can I do next time?
>>
>> Could the "radconfig" entry in syslog and the "radius_getnextattr"
>> message from gdb point to something?
>>
>> Keep in mind that the RADIUS server is a proprietary one; it is not
>> FreeRADIUS or anything similar.
>>
>> Is there something I can do next time with gdb and/or strace?
>>
>> Thank you again.
>>
>> 2010/7/27 David Bird <david at coova.com>:
>>>
>>> Wichert asked a good question; are you using SSL features of
>>> CoovaChilli?
>>>
>>> For info on gdb, you can google for it, of course. Here is a quick howto
>>> page:
>>> http://www.freebsd.org/doc/en/books/developers-handbook/debugging.html
>>>
>>> For strace, use "strace -p <pid>" and you will see the system calls
>>> being executed - if it is using 100%, there must be a runaway loop
>>> occurring.
>>>
>>> David
>>>
>>> On Tue, 2010-07-27 at 09:20 +0200, Marco Simioni wrote:
>>>>
>>>> It happened 3 times in a month,
>>>>
>>>> not immediately but after some days of regular work.
>>>>
>>>> The "top" command showed chilli consuming 100% CPU.
>>>>
>>>> Customers reported that they could not get a DHCP address and then
>>>> the login page.
>>>>
>>>> I will try chilli_query when it happens again.
>>>>
>>>> How can I attach with gdb or strace? Can you point me to some
>>>> documentation?
>>>>
>>>> Can I run it in debug mode only in the console, with -fd, or also as
>>>> a service?
>>>>
>>>> Thank you in advance.
>>>>
>>>> regards,
>>>>
>>>> Marco
>>>>
>>>> 2010/7/27 David Bird <david at coova.com>:
>>>> > How quickly does this start to happen? Immediately? After how long?
>>>> >
>>>> > Is chilli also not working during this time? Does chilli_query hang?
>>>> >
>>>> > Are you able to attach gdb or use strace to get more info?
>>>> >
>>>> > If able, you can try running in debug mode for additional log
>>>> > information?
>>>> >
>>>> > Thanks,
>>>> > David
>>>> >
>>>> >
>>>> > On Tue, 2010-07-27 at 08:55 +0200, Marco Simioni wrote:
>>>> >> Hi all, my customer is reporting a CPU problem.
>>>> >>
>>>> >> Chilli ends up consuming the whole processor, going to 100%.
>>>> >>
>>>> >> It is a brand new setup, built on a VMware ESXi virtual machine on an
>>>> >> HP ML115,
>>>> >> 1.8GHz CPU allocated,
>>>> >> 512MB RAM allocated,
>>>> >> coova-chilli 1.2.2,
>>>> >> Ubuntu 9.10 ( 2.6.31-14-server ).
>>>> >>
>>>> >> It has happened about three times; each time I solved it with a reboot.
>>>> >>
>>>> >> Very few clients, < 10.
>>>> >>
>>>> >> Suggestions?
>>>> >>
>>>> >> How can I investigate and understand when it happens?
>>>> >>
>>>> >> Thanks in advance.
>>>> >>
>>>> >> Best regards,
>>>> >>
>>>> >> Marco
>>>> >> _______________________________________________
>>>> >> Chilli mailing list
>>>> >> Chilli at coova.org
>>>> >> http://lists.coova.org/cgi-bin/mailman/listinfo/chilli
>>>> >
>>>> >
>>>> >
>>>
>>>
>>>
>>