[Chilli] 100% cpu problem

Marco Simioni m.simioni at gmail.com
Thu Oct 7 09:35:11 UTC 2010


I'll let you know.

I now print both offset and len to the log.

Let's see what happens.

Regards.

2010/10/7 David Bird <david at coova.com>:
> In getnextattr offset should always increment, and both offset and len
> are unsigned. Though, to be safe (to protect against a bad radius
> packet), we could change:
>
> -    if (t->t == 0)
> +    if (t->t == 0 || t->l < 2)
>       return -1;
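
To illustrate why the extra length check matters, here is a minimal sketch
of a type/length attribute walk. It is not CoovaChilli's code: the function
name find_attr and the flat byte-buffer layout (one type byte, one length
byte that counts the two header bytes, then the value) are assumptions made
for this example. If a malformed packet carries a length of 0, advancing by
that length leaves offset unchanged, so a "while (offset < len)" style loop
would spin forever; rejecting any length below 2 guarantees that offset
strictly increases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_HDR_LEN 2  /* one type byte + one length byte (assumed layout) */

/*
 * Hypothetical helper, for illustration only.
 * Scans `len` bytes of attributes in `buf` for the first attribute whose
 * type equals `type`. Returns its offset, or -1 if the attribute is absent
 * or the buffer is malformed.
 */
static long find_attr(const uint8_t *buf, size_t len, uint8_t type)
{
    size_t offset = 0;

    assert(buf != NULL);

    while (offset + ATTR_HDR_LEN <= len) {
        uint8_t t = buf[offset];      /* attribute type */
        uint8_t l = buf[offset + 1];  /* total length, header included */

        /* Without the l < 2 check, l == 0 would leave offset unchanged
         * and the loop would never terminate. */
        if (t == 0 || l < 2)
            return -1;

        /* Reject an attribute that claims to run past the buffer. */
        if (l > len - offset)
            return -1;

        if (t == type)
            return (long)offset;

        offset += l;  /* l >= 2 here, so offset strictly increases */
    }
    return -1;
}
```

With this guard, a buffer that starts with the bytes {1, 0, ...} is rejected
on the first iteration instead of being scanned forever.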
>
> On Thu, 2010-10-07 at 10:54 +0200, Marco Simioni wrote:
>> Thank you, Alberto, for your answer.
>>
>> I debugged it again with gdb and saw that it was inside
>> "radius_getnextattr" again.
>>
>> I'm now looking into the "radius_getnextattr()" function.
>>
>> I see there is a while loop that could potentially become an endless
>> loop: "while (offset < len)".
>>
>> I inserted some debug messages; let's see what they show the next time
>> it happens.
>>
>> Then I'll try your suggestions.
>>
>> Thank you.
>>
>> Regards,
>>
>> Marco
>>
>> 2010/10/6 Alberto Bellettato <albesvs at yahoo.it>:
>> > Have you tried temporarily removing the "/etc/rc2.d/S20chilli radconfig"
>> > script from your cron table? It could help isolate the problem.
>> > Also, have you enabled ssl or redir in your chilli config?
>> >
>> > ----- Original Message ----- From: "Marco Simioni" <m.simioni at gmail.com>
>> > To: <chilli at coova.org>
>> > Sent: Wednesday, October 06, 2010 11:26 AM
>> > Subject: Re: [Chilli] 100% cpu problem
>> >
>> >
>> > No news.
>> >
>> > It still happens.
>> >
>> > I'm using the latest SVN version.
>> >
>> > How can I investigate the problem?
>> >
>> > 2010/9/23 Marco Simioni <m.simioni at gmail.com>:
>> >>
>> >> It happened again: I had 100% CPU after days of uptime.
>> >>
>> >> This time, I was able to identify the following:
>> >>
>> >> 1) Thanks to the "sar" utility, I was able to narrow the CPU spike
>> >> down to between 09:45 and 10:05.
>> >>
>> >> 09:05:02  CPU  %user  %nice  %system  %iowait  %steal  %idle
>> >> 09:45:02  all   0,00   0,00     0,04     4,26    0,00  95,70
>> >> 09:55:02  all   2,51   0,00     0,08     3,47    0,00  93,94
>> >> 10:05:02  all  98,32   0,00     1,68     0,00    0,00   0,00
>> >>
>> >> 2) In syslog, the messages I see are:
>> >>
>> >> Sep 23 09:51:10 izc coova-chilli[939]: chilli.c: 3402: DHCP addr
>> >> released by MAC=90-84-0D-D2-00-2D IP=0.0.0.0
>> >> Sep 23 09:51:11 izc coova-chilli[939]: chilli.c: 3248: New DHCP
>> >> request from MAC=90-84-0D-D2-00-2D
>> >> Sep 23 09:52:02 izc coova-chilli[939]: chilli.c: 3248: New DHCP
>> >> request from MAC=00-0E-6A-7A-AB-9C
>> >> Sep 23 09:52:02 izc coova-chilli[939]: chilli.c: 3209: Client
>> >> MAC=00-0E-6A-7A-AB-9C assigned IP 10.1.0.73
>> >> Sep 23 09:55:02 izc CRON[9452]: (root) CMD (command -v debian-sa1 >
>> >> /dev/null && debian-sa1 1 1)
>> >> Sep 23 10:00:03 izc CRON[9455]: (root) CMD (/etc/rc2.d/S20chilli
>> >> radconfig)
>> >> Sep 23 10:05:01 izc CRON[9461]: (root) CMD (command -v debian-sa1 >
>> >> /dev/null && debian-sa1 1 1)
>> >>
>> >> 3) I tried to attach to the chilli process with "gdb". Nothing
>> >> happened. I didn't know what to do, so I tried a single step and got
>> >> the following:
>> >>
>> >> (gdb) step
>> >> Single stepping until exit from function radius_getnextattr, which has
>> >> no line number information.
>> >>
>> >> Then nothing else.
>> >>
>> >> 4) I tried to attach with "strace":
>> >>
>> >> root@izc:# strace -p 939
>> >> Process 939 attached - interrupt to quit
>> >>
>> >> and nothing happened.
>> >>
>> >> I then had to reboot so that customers could get online.
>> >>
>> >> What can I do next time?
>> >>
>> >> Could the "radconfig" entry in syslog and the gdb message
>> >> "radius_getnextattr" point to something?
>> >>
>> >> Keep in mind that the radius server is a proprietary one; it is not
>> >> freeradius or anything similar.
>> >>
>> >> Is there something I can do next time with gdb and/or strace?
>> >>
>> >> Thank you again.
>> >>
>> >> 2010/7/27 David Bird <david at coova.com>:
>> >>>
>> >>> Wichert asked a good question; are you using SSL features of
>> >>> CoovaChilli?
>> >>>
>> >>> For info on gdb, you can google for it, of course. Here is a quick howto
>> >>> page:
>> >>> http://www.freebsd.org/doc/en/books/developers-handbook/debugging.html
>> >>>
>> >>> For strace, use "strace -p <pid>" and you will see the system calls
>> >>> being executed - if it is using 100%, there must be a runaway loop
>> >>> occurring.
>> >>>
>> >>> David
>> >>>
>> >>> On Tue, 2010-07-27 at 09:20 +0200, Marco Simioni wrote:
>> >>>>
>> >>>> It happened 3 times in a month,
>> >>>>
>> >>>> not immediately but after some days of regular work.
>> >>>>
>> >>>> The "top" command said chilli was consuming 100% CPU.
>> >>>>
>> >>>> Customers reported that they could not get DHCP or the login page.
>> >>>>
>> >>>> I will try to use chilli_query when it happens again.
>> >>>>
>> >>>> How can I attach with gdb or strace? Can you point me to some
>> >>>> documentation?
>> >>>>
>> >>>> Can I run it in debug mode only in console mode, with -fd, or also as
>> >>>> a service?
>> >>>>
>> >>>> Thank you in advance.
>> >>>>
>> >>>> regards,
>> >>>>
>> >>>> Marco
>> >>>>
>> >>>> 2010/7/27 David Bird <david at coova.com>:
>> >>>> > How quickly does this start to happen? Immediately? After how long?
>> >>>> >
>> >>>> > Is chilli also not working during this time? Does chilli_query hang?
>> >>>> >
>> >>>> > Are you able to attach gdb or use strace to get more info?
>> >>>> >
>> >>>> > If able, you can try running in debug mode for additional log
>> >>>> > information?
>> >>>> >
>> >>>> > Thanks,
>> >>>> > David
>> >>>> >
>> >>>> >
>> >>>> > On Tue, 2010-07-27 at 08:55 +0200, Marco Simioni wrote:
>> >>>> >> Hi all, my customer is reporting a CPU problem.
>> >>>> >>
>> >>>> >> Chilli ends up consuming the whole processor, reaching 100%.
>> >>>> >>
>> >>>> >> It is a brand new setup, built on a VMware ESXi virtual machine on an
>> >>>> >> HP ML115,
>> >>>> >> 1.8GHz CPU allocated,
>> >>>> >> 512MB RAM allocated,
>> >>>> >> coova-chilli 1.2.2,
>> >>>> >> Ubuntu 9.10 ( 2.6.31-14-server ).
>> >>>> >>
>> >>>> >> It has happened about three times; each time I solved it with a reboot.
>> >>>> >>
>> >>>> >> Very few clients, < 10.
>> >>>> >>
>> >>>> >> Suggestions?
>> >>>> >>
>> >>>> >> How can I investigate and understand when it happens?
>> >>>> >>
>> >>>> >> Thanks in advance.
>> >>>> >>
>> >>>> >> Best regards,
>> >>>> >>
>> >>>> >> Marco
>> >>>> >> _______________________________________________
>> >>>> >> Chilli mailing list
>> >>>> >> Chilli at coova.org
>> >>>> >> http://lists.coova.org/cgi-bin/mailman/listinfo/chilli
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>
>> >>>
>> >>>
>> >>
>> >
>
>
>

