[Chilli] 100% cpu problem
Marco Simioni
m.simioni at gmail.com
Tue Oct 19 17:10:50 UTC 2010
I have some news.
It happened again, but this time I had my debug messages inserted into
rad_getattr, and here is what I see in the debug log:
Oct 17 23:27:33 izc coova-chilli[955]: chilli.c: 3142: Received RADIUS response
Oct 17 23:27:33 izc coova-chilli[955]: chilli.c: 3164: Received
Access-Reject from radius server
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39
So it seems that I received a malformed RADIUS packet, and chilli goes
into an infinite loop calling "radius_getnextattr".
The debug messages come from the following code (please note the
log_dbg() calls I added):
int
radius_getnextattr(struct radius_packet_t *pack, struct radius_attr_t **attr,
                   uint8_t type, uint32_t vendor_id, uint8_t vendor_type,
                   int instance, size_t *roffset) {
  struct radius_attr_t *t;
  size_t len = ntohs(pack->length) - RADIUS_HDRSIZE;
  size_t offset = *roffset;
  int count = 0;

  /*
  if (0) {
    printf("radius_getattr payload(len=%d,off=%d) %.2x %.2x %.2x %.2x\n",
           len, offset, pack->payload[offset], pack->payload[offset+1],
           pack->payload[offset+2], pack->payload[offset+3]);
  }
  */

  log_dbg("radius_getnextattr");

  while (offset < len) {
    log_dbg("offset=%d, len=%d", offset, len);
    t = (struct radius_attr_t *)(&pack->payload[offset]);

    /*
    if (0) {
      printf("radius_getattr %d %d %d %.2x %.2x \n", t->t, t->l,
             ntohl(t->v.vv.i), (int) t->v.vv.t, (int) t->v.vv.l);
    }
    */

    offset += t->l;

    if (t->t == 0)
      return -1;

    if (t->t != type)
      continue;

    if (t->t == RADIUS_ATTR_VENDOR_SPECIFIC && vendor_id &&
        (ntohl(t->v.vv.i) != vendor_id || t->v.vv.t != vendor_type))
      continue;

    if (count == instance) {
      if (type == RADIUS_ATTR_VENDOR_SPECIFIC && vendor_id)
        *attr = (struct radius_attr_t *) &t->v.vv.t;
      else
        *attr = t;

      /*
      if (0) printf("Found %.*s\n", (*attr)->l - 2, (char *)(*attr)->v.t);
      */

      *roffset = offset;
      return 0;
    }
    else {
      count++;
    }
  }

  return -1; /* Not found */
}
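Since the packet looks malformed, I have also been wondering whether an
attribute with a broken length octet could make the walk stop advancing
somewhere. Purely as a standalone sketch (this is NOT chilli code; the
function, the buffer and the values below are made up by me), this is
the kind of length validation I have in mind:

#include <stdint.h>
#include <stdio.h>

/* Standalone sketch, not chilli code: walk a RADIUS attribute list and
   bail out on any length octet that cannot advance the offset. Returns
   0 if the whole list is well formed, -1 otherwise. */
static int walk_attrs(const uint8_t *payload, size_t len) {
  size_t offset = 0;

  while (offset + 2 <= len) {
    uint8_t type = payload[offset];
    uint8_t alen = payload[offset + 1];

    /* An attribute is at least 2 octets (Type + Length); anything
       smaller would leave offset unchanged forever. */
    if (alen < 2 || offset + alen > len)
      return -1;

    printf("attr type=%u len=%u at offset=%zu\n",
           (unsigned) type, (unsigned) alen, offset);
    offset += alen;
  }

  return (offset == len) ? 0 : -1;
}

int main(void) {
  /* Two fake attributes: type 1 ("marco", 7 octets) and type 18 (14 octets). */
  uint8_t buf[] = { 1, 7, 'm', 'a', 'r', 'c', 'o',
                    18, 14, 'h', 'e', 'l', 'l', 'o', ' ',
                    'w', 'o', 'r', 'l', 'd', '!' };

  return walk_attrs(buf, sizeof(buf)) == 0 ? 0 : 1;
}

In chilli terms that would mean checking t->l before doing
"offset += t->l", but I have not tried such a patch yet, so take it
only as an idea.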
So, judging from the log, it is not the "while (offset < len) {" loop
itself that is spinning, but whatever keeps calling the
radius_getnextattr() routine.
For example, it is called several times in chilli.c inside while
loops, like these:
Line 2860:

  while (!radius_getnextattr(pack, &attr, RADIUS_ATTR_VENDOR_SPECIFIC,
                             RADIUS_VENDOR_CHILLISPOT, RADIUS_ATTR_CHILLISPOT_CONFIG,
                             0, &offset)) {

Line 2895:

  while (!radius_getnextattr(pack, &attr, RADIUS_ATTR_VENDOR_SPECIFIC,
                             RADIUS_VENDOR_WISPR, RADIUS_ATTR_WISPR_REDIRECTION_URL,
                             0, &offset)) {

Line 3026:

  while (!radius_getnextattr(pack, &attr,
                             RADIUS_ATTR_VENDOR_SPECIFIC,
                             RADIUS_VENDOR_CHILLISPOT,
                             RADIUS_ATTR_CHILLISPOT_CONFIG,
                             0, &offset));
I tried inserting further debug messages at these call sites to see
which while loop ends up in an infinite loop, along the lines of the
sketch below.
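Just to be clear, this is only a sketch against the loop at chilli.c
line 2895; the guard counter, the cap and the extra messages are mine
and not part of chilli:

  int guard = 0;

  /* Debug sketch: count iterations of the WISPr redirect loop so the
     log shows which caller spins and whether offset ever advances. */
  while (!radius_getnextattr(pack, &attr, RADIUS_ATTR_VENDOR_SPECIFIC,
                             RADIUS_VENDOR_WISPR,
                             RADIUS_ATTR_WISPR_REDIRECTION_URL,
                             0, &offset)) {
    log_dbg("WISPr redirect loop: iteration=%d offset=%d",
            ++guard, (int) offset);

    if (guard > 256) {
      /* No sane packet carries this many attributes; assume we are stuck. */
      log_dbg("WISPr redirect loop: giving up, offset is not advancing");
      break;
    }

    /* ... original loop body unchanged ... */
  }

With the same counter in the other two loops, the next time it hangs
the log should show which of them is spinning and whether offset stays
at the same value.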
Suggestions?
Regards,
Marco
2010/10/15 Caciano Machado <caciano at gmail.com>:
> Hi,
>
> Has anyone tried running coova-chilli with gprof to figure out which
> function is eating CPU?
>
> Cheers
>
> On Thu, Oct 7, 2010 at 1:13 PM, Giovanni Toraldo <me at gionn.net> wrote:
>>
>> Hi,
>>
>> I am writing this to let you know that I am facing the same issue.
>>
>> Unfortunately, I cannot give more information than Marco provided:
>> chilli_redir seems to hang quite rapidly; however, the coova
>> service itself only suffers some lag due to the high load generated
>> by the hanging processes.
>>
>> My system is already in production, with an average of 10 users
>> connected and spikes of thousands.
>> Before the latest release, I never faced this issue.
>>
>> I hope I can help in some way to fix this issue.
>>
>> Thanks.
>>
>> --
>> Giovanni Toraldo
>> http://gionn.net/