
Re: Problems with iprop



I once got into a state where iprop crash dumps, the Heimdal log
files, and the DB itself were all fighting for space on my /var
partition.  Once the partition fills up, it can't write DB
updates, which causes it to try to write log messages, which it can't
write either, which causes it to consume memory until it crashes, which
creates a crash dump that a cron process tries to save in /var twice
a day, . . .  ;-)

I never did backtrace the original problem, but you can't save a 2GB  
crash dump into a partition that's only 2GB, even if it is mostly  
empty most of the time.  ;-)

The solution was to erase all the crash dumps and text log files, and
truncate the binary change log file with
"/usr/heimdal/sbin/truncate_log".  For Heimdal 0.8+ the equivalent
command would be "/usr/heimdal/sbin/iprop-log truncate".  (I suggest
you shut down the KDC and iprop while you do this to avoid issues with
file locking.)  This will erase all logged changes, but preserve the
DB version number so you don't have to re-download the whole DB.
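
A minimal sketch of the truncation step for Heimdal 0.8+, assuming the
install path given above.  The IPROP_LOG and DRY_RUN variables and the
run helper are names made up for this example, not part of Heimdal.
Stopping the KDC and the iprop daemons, and removing the crash dumps
and text logs, is left out because how you do that depends on your
system.

```shell
#!/bin/sh
# Sketch only: truncate the binary iprop change log.
# IPROP_LOG and DRY_RUN are hypothetical knobs for this example.
IPROP_LOG="${IPROP_LOG:-/usr/heimdal/sbin/iprop-log}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print what would be run

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Shut down the KDC and iprop first (not shown), then truncate the log.
truncate_iprop_log() {
    run "$IPROP_LOG" truncate
}

truncate_iprop_log
```

Run it as-is to see the command it would execute; set DRY_RUN=0 once
the daemons are stopped to actually truncate the log.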

On Jul 25, 2007, at 7:55 AM, Dr A V Le Blanc wrote:

> Until recently I've been running heimdal 0.7.2.dfsg.1-10 on Debian
> etch systems, and I have had occasional problems with iprop, which
> have been getting worse.
>
> First, I am getting thousands of error messages on the slaves:
>
>      ipropd-slave[8760]: kadm5_log_replay: 2469: Entry already exists in database
>
> There are so many of these that the disks are filling up; for example,
> yesterday at 06:25:32 there were 172 of them, at 06:25:33 there were
> 457, and at 06:25:34 there were 475.  These messages are in
> /var/log/auth.log.
> Moreover the binary log file in /var/lib/heimdal-kdc/log seems to grow
> enormously large on the slave machines, filling the 4GB partition in
> a few days.  In addition, on the master machine, the ipropd-master
> process keeps getting killed by the kernel, which logs this message in
> /var/log/kern.log:
>
>      kernel: Out of memory: Killed process 1545 (ipropd-master)
>
> Since the database synchronisation gets lost so frequently, I have
> cron jobs which check every ten minutes on all machines and
> restart the iprop master or slave processes, at least if they are
> not running; for example, this script gets run every ten minutes
> on the slave:
>
>      if [ ! -r /var/run/heimdal-kdc.pid ];then exit;fi
>      if [ ! -r /proc/`cat /var/run/heimdal-kdc.pid`/stat ];then exit 0;fi
>      if [ -r /var/run/ipropd-slave.pid ] ; then
>        if [ -r /proc/`cat /var/run/ipropd-slave.pid`/stat ];then exit 0;fi
>      fi
>      . /etc/default/heimdal-kdc
>      start-stop-daemon --start --quiet --background --make-pidfile \
>        --pidfile /var/run/ipropd-slave.pid --exec /usr/sbin/ipropd-slave \
>        -- "$SLAVE_PARAMS"
>      exit $?
>
> Anyway, the situation is getting worse, and so I decided to backport
> the available heimdal 0.8.1 Debian packages to etch and to try those.
> Now the master iprop process is dying without giving an error message,
> but the logs are filling up with messages like this:
>
>      ipropd-master[5151]: send_diffs: failed to find previous entry: kadm5_log_previous: log entry have consistency failure, length wrong
>
> Clearly, whatever else, neither version of iprop is succeeding in
> playing the log messages properly on the slaves.  Has anyone any
> insight to offer before I try reporting this as a bug on the Debian
> lists?  Am I missing something obvious?
>
>      -- Owen

------------------------------------------------------------------------
The opinions expressed in this message are mine,
not those of Caltech, JPL, NASA, or the US Government.
Henry.B.Hotz@jpl.nasa.gov, or hbhotz@oxy.edu