[Chilli] 100% cpu problem

Marco Simioni m.simioni at gmail.com
Tue Oct 19 17:10:50 UTC 2010


I have some news.

It happened again, but this time I had my debug messages inserted into
rad_getattr, and here is what I see in the debug log:

Oct 17 23:27:33 izc coova-chilli[955]: chilli.c: 3142: Received RADIUS response
Oct 17 23:27:33 izc coova-chilli[955]: chilli.c: 3164: Received Access-Reject from radius server
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 805: radius_getnextattr
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=0, len=39
Oct 17 23:27:33 izc coova-chilli[955]: radius.c: 808: offset=21, len=39

So it seems that I received a malformed RADIUS packet, and chilli goes
into an infinite loop calling "radius_getnextattr".

The debug messages are produced as follows (please note the log_dbg() calls):

int
radius_getnextattr(struct radius_packet_t *pack, struct radius_attr_t **attr,
	       uint8_t type, uint32_t vendor_id, uint8_t vendor_type,
	       int instance, size_t *roffset) {
  struct radius_attr_t *t;
  size_t len = ntohs(pack->length) - RADIUS_HDRSIZE;
  size_t offset = *roffset;
  int count = 0;

  /*
  if (0) {
    printf("radius_getattr payload(len=%d,off=%d) %.2x %.2x %.2x %.2x\n",
	   len, offset, pack->payload[offset], pack->payload[offset+1],
	   pack->payload[offset+2], pack->payload[offset+3]);
  }
  */

  log_dbg("radius_getnextattr");

  while (offset < len) {
    log_dbg("offset=%d, len=%d", (int) offset, (int) len);

    t = (struct radius_attr_t *)(&pack->payload[offset]);

    /*
    if (0) {
      printf("radius_getattr %d %d %d %.2x %.2x \n", t->t, t->l,
	     ntohl(t->v.vv.i), (int) t->v.vv.t, (int) t->v.vv.l);
    }
    */

    offset +=  t->l;

    if (t->t == 0)
      return -1;

    if (t->t != type)
      continue;

    if (t->t == RADIUS_ATTR_VENDOR_SPECIFIC && vendor_id &&
	(ntohl(t->v.vv.i) != vendor_id || t->v.vv.t != vendor_type))
      continue;

    if (count == instance) {

      if (type == RADIUS_ATTR_VENDOR_SPECIFIC && vendor_id)
	*attr = (struct radius_attr_t *) &t->v.vv.t;
      else
	*attr = t;

      /*
      if (0) printf("Found %.*s\n", (*attr)->l - 2, (char *)(*attr)->v.t);
      */

      *roffset = offset;
      return 0;
    }
    else {
      count++;
    }
  }

  return -1; /* Not found */
}
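
A side note on the loop above: if a malformed packet ever carried an
attribute whose length byte is 0, "offset += t->l" would not advance, and
an attribute with a non-zero type different from the requested one would
make the "continue" spin forever inside the function. That is not what the
log shows (offset moves from 0 to 21), but a defensive check is cheap. A
minimal sketch, assuming a plain byte buffer instead of struct
radius_packet_t; tlv_next() and its parameters are hypothetical names, not
part of chilli:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical hardened attribute walker over a raw type/length/value
 * payload. Returns 0 and stores the attribute's start in *found_off when
 * an attribute of the wanted type is found at or after *roffset.
 * Returns -1 when nothing is found or the payload is malformed.
 */
int tlv_next(const uint8_t *payload, size_t len, uint8_t type,
             size_t *roffset, size_t *found_off)
{
  size_t offset;

  assert(payload != NULL && roffset != NULL && found_off != NULL);
  offset = *roffset;

  /* Need at least the type byte and the length byte. */
  while (offset + 2 <= len) {
    uint8_t t = payload[offset];
    uint8_t l = payload[offset + 1];

    /* A length below 2 cannot advance safely; a length that runs past
     * the end of the payload is an overrun. Treat both as malformed. */
    if (l < 2 || l > len - offset)
      return -1;

    if (t == type) {
      *found_off = offset;
      *roffset = offset + l;  /* resume after this attribute next time */
      return 0;
    }

    offset += l;
  }

  return -1;  /* not found */
}
```

With this check a zero-length attribute ends the walk with -1 instead of
leaving offset stuck at the same position.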


So it is not the "while (offset < len) {" loop itself that spins,
but some caller that keeps invoking the radius_getnextattr() routine.

For example, it is called in several places in chilli.c from inside
while loops, like these:

Line 2860:

    while (!radius_getnextattr(pack, &attr, RADIUS_ATTR_VENDOR_SPECIFIC,
			       RADIUS_VENDOR_CHILLISPOT, RADIUS_ATTR_CHILLISPOT_CONFIG,
			       0, &offset)) {

Line 2895:

    while (!radius_getnextattr(pack, &attr, RADIUS_ATTR_VENDOR_SPECIFIC,
			       RADIUS_VENDOR_WISPR, RADIUS_ATTR_WISPR_REDIRECTION_URL,
			       0, &offset)) {

Line 3026:

	while (!radius_getnextattr(pack, &attr,
				   RADIUS_ATTR_VENDOR_SPECIFIC,
				   RADIUS_VENDOR_CHILLISPOT,
				   RADIUS_ATTR_CHILLISPOT_CONFIG,
				   0, &offset));
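
Loops of this shape only terminate if offset keeps its value between
calls (so each call resumes after the previous match) and the function
eventually returns -1. A minimal sketch of that contract with an
iteration cap added as a safety net, assuming a raw byte payload;
next_attr() is a hypothetical stand-in for radius_getnextattr() and
count_attrs() is an illustrative caller, neither is taken from chilli.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same contract as radius_getnextattr():
 * 0 = found (and *roffset moved past the match), -1 = not found. */
static int next_attr(const uint8_t *payload, size_t len, uint8_t type,
                     size_t *roffset)
{
  size_t offset = *roffset;

  while (offset + 2 <= len) {
    uint8_t t = payload[offset];
    uint8_t l = payload[offset + 1];

    if (l < 2 || l > len - offset)
      return -1;              /* malformed length, stop walking */

    offset += l;
    if (t == type) {
      *roffset = offset;      /* only updated when a match is found */
      return 0;
    }
  }
  return -1;
}

/* Counts the attributes of one type. Returns -1 if more than max_iter
 * matches are seen, so a caller-side bug cannot spin forever. */
int count_attrs(const uint8_t *payload, size_t len, uint8_t type,
                int max_iter)
{
  size_t offset = 0;          /* initialised once, outside the loop */
  int n = 0;

  assert(payload != NULL);

  while (!next_attr(payload, len, type, &offset)) {
    if (++n > max_iter)
      return -1;              /* safety net tripped */
  }
  return n;
}
```

If offset were reset inside the loop body, the first match would be found
again on every pass; the cap turns that kind of mistake into an error
return instead of 100% CPU.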


I have inserted further debug messages at these lines to see which of
the while loops is the one that ends up in an infinite loop.
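
One cheap way to tell the call sites apart is to wrap the function in a
macro that records __FILE__ and __LINE__ on every call. A minimal sketch,
assuming a trivial stand-in function; getnext(), getnext_traced() and
trace_calls are hypothetical names, and in chilli the fprintf() would
presumably be replaced by log_dbg():

```c
#include <assert.h>
#include <stdio.h>

/* Number of traced calls so far (illustrative only). */
static int trace_calls;

/* Hypothetical stand-in for the function being traced:
 * 0 = found, -1 = not found. */
static int getnext(int type)
{
  return type == 26 ? 0 : -1;
}

/* Calls the real function, then logs where the call came from. */
static int getnext_traced(int type, const char *file, int line)
{
  int rc = getnext(type);     /* real function: macro not defined yet */

  fprintf(stderr, "%s:%d: getnext(%d) -> %d\n", file, line, type, rc);
  trace_calls++;
  return rc;
}

/* From here on, every use of getnext() records its own call site. */
#define getnext(type) getnext_traced((type), __FILE__, __LINE__)
```

The line number that keeps repeating in the log then identifies the
loop that never exits.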

Suggestions?

Regards,

Marco


2010/10/15 Caciano Machado <caciano at gmail.com>:
> Hi,
>
> Has anyone tried running coova chilli with gprof to figure out which function is
> eating CPU?
>
> Cheers
>
> On Thu, Oct 7, 2010 at 1:13 PM, Giovanni Toraldo <me at gionn.net> wrote:
>>
>> Hi,
>>
>> I am writing this to let you know that I am facing the same issue.
>>
>> Unfortunately, I cannot give more information than Marco provided:
>> chilli_redir seems to hang quite rapidly, but the coova
>> service only gets some lag due to the high load generated by the
>> hanging processes.
>>
>> My system is already in production, with an average of 10 users
>> connected, with spikes of thousands.
>> Before the latest release, I never faced this issue.
>>
>> I hope I can help in some way to fix this issue.
>>
>> Thanks.
>>
>> --
>> Giovanni Toraldo
>> http://gionn.net/
>> _______________________________________________
>> Chilli mailing list
>> Chilli at coova.org
>> http://lists.coova.org/cgi-bin/mailman/listinfo/chilli
>
>

