FS#14467 - [man-db] mandb always recreates the database since man-pages 3.20

Attached to Project: Arch Linux
Opened by Matthias Dienstbier (fs4000) - Sunday, 26 April 2009, 20:08 GMT
Last edited by Andreas Radke (AndyRTR) - Sunday, 04 October 2009, 17:19 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Andreas Radke (AndyRTR)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 17
Private No

Details

Description:
Since man-pages 3.20, mandb no longer recognizes that the database is up to date and does not need updating. It therefore recreates the database on every run, which takes a considerable amount of time.

Additional info:
* package version(s)
man-db 2.5.5-1
man-pages 3.20-2

Steps to reproduce:
run mandb twice and watch it doing everything twice
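
A minimal sketch for measuring the reproduction step above, assuming Python 3 is available. The helper name `time_runs` is hypothetical and not part of man-db; it only runs a command several times in a row and records how long each run takes.

```python
import subprocess
import time

def time_runs(cmd, runs=2):
    """Run cmd `runs` times in a row and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time is of interest here.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations
```

On an affected system, calling `time_runs(["mandb"])` with enough privileges to write the system database would be expected to show the second run taking roughly as long as the first, instead of finishing almost immediately.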

Closed by  Andreas Radke (AndyRTR)
Sunday, 04 October 2009, 17:19 GMT
Reason for closing:  Fixed
Comment by Matthias Dienstbier (fs4000) - Sunday, 26 April 2009, 20:15 GMT
Perhaps you need to force a rebuild of the database once (mandb -c) before this issue appears.
Comment by Bernd Pol (bernarcher) - Sunday, 26 April 2009, 21:19 GMT
Just tested it. Rebuilding the database does not help. mandb still thinks it has to be rebuilt.
Comment by Matthias Dienstbier (fs4000) - Sunday, 26 April 2009, 21:25 GMT
I did not say that rebuilding helps. Downgrading man-pages to a version less than 3.20 does.
BTW: 3.20 is outdated, too. I marked it in the repo. Perhaps this also fixes that.
Comment by Attila (attila) - Tuesday, 28 April 2009, 17:49 GMT
I can confirm this problem. I ran a test with man-pages 3.21 without the POSIX man pages and everything works fine.

Question: Is there a reason why we include the POSIX man pages instead of shipping them as a separate package?

Note: In the cron job I put "nice -n 19 ionice -c 3" before "/usr/bin/mandb --quiet", which is what openSUSE does in its cron job.
Comment by Colin Watson (cjwatson) - Wednesday, 13 May 2009, 10:20 GMT
Please run mandb with --debug (and without --quiet) and attach the output.
Comment by Matthias Dienstbier (fs4000) - Wednesday, 13 May 2009, 13:34 GMT
mandb -d > mandb.log 2>&1
Comment by Colin Watson (cjwatson) - Friday, 15 May 2009, 15:28 GMT
OK. Could you now attach the following three files:

/var/cache/man/index.db
/usr/share/man/man3/acoshl.3p.gz
/usr/share/man/man3/acosh.3p.gz

(I think I know what's happening now; acoshl(3p) is apparently a .so reference to acosh(3p) but mandb is only recording acosh(3p) as a "ghost" whatis reference, and then getting confused when it sees it as a real file. However, I don't have an Arch Linux system myself to test this directly, and hope that the above files will give me enough material to reproduce this directly and test a fix.)
Comment by Matthias Dienstbier (fs4000) - Friday, 15 May 2009, 22:02 GMT
Thanks for helping
   man.tar.bz2 (509.3 KiB)
Comment by Andreas Radke (AndyRTR) - Sunday, 14 June 2009, 21:27 GMT
Colin, any progress?
Comment by Colin Watson (cjwatson) - Sunday, 21 June 2009, 01:55 GMT
It took me a while to reproduce this since it turned out I needed to have acosh.3.gz as well as acosh.3p.gz, and it seemed to depend on the exact order in which mandb processed these files. A simplified manual page tree sufficient to reproduce the problem is acosh(3) as a real page, acosh(3p) as a real page, and acoshl(3p) as a .so link pointing to acosh(3p), and a version of mandb hacked to process the pages in that order regardless of what readdir() returns. A detailed technical explanation follows. First, Andreas, here's the change I believe you need to fix this bug (you'll see from the revision history that it took me a few goes, largely because it's very late at night here):

http://bzr.savannah.gnu.org/lh/man-db/trunk/revision/1070?compare_revid=1067

What's happening here is twofold, but it stems from a single bug. When doing exact-section database lookups, man-db was incorrectly returning all results for which the section found in the database is a prefix of that which was requested. In other words, when looking for exactly acosh(3p), it was returning acosh(3) as well. This sounds fairly minor, but there are two consequences of this:

1) When mandb processes acosh(3p), it does an exact-section database lookup to check whether it's already seen it. This returns acosh(3), which it processed just earlier, and it decides it doesn't need to do any more. Then it processes acoshl(3p), which causes it to insert an entry for that, but also causes it to insert a ghostly "whatis" entry for acosh(3p). (Such entries are mostly there to support people who write extra names into the NAME section of manual pages but don't install links on the filesystem as they should.) Normally this whatis entry would be discarded because there would already be a real-page entry which takes precedence, but that didn't happen for the reason just given, so the database ends up storing: acosh(3) real page, acosh(3p) whatis, acoshl(3p) link.

2) When mandb is run a second time, it looks for any modified directories, and has a quick look through them to see if anything changed. One of the things it checks is whether any of those ghostly whatis entries appear to have turned into real pages, which normally means that it needs to scan that page to update its records. In this case, it looks up acosh(3p) with another one of those exact-section database lookups, which as mentioned are not very reliable, although in this case it sort of gets lucky because there is a real acosh(3p) there now. But - oh dear, it's a whatis entry, and yet there's an acosh.3p.gz file on the filesystem. Better rescan to see what it contains ...
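
The two consequences above can be illustrated with a toy model. This is only a sketch under stated assumptions, not man-db's actual code: the database is a plain dict keyed by (name, section), and the function names `lookup_buggy`, `lookup_exact`, `build` and `needs_rescan` are hypothetical.

```python
def lookup_buggy(db, name, section):
    """Match entries whose stored section is a PREFIX of the requested one."""
    return [(n, s) for (n, s) in db if n == name and section.startswith(s)]

def lookup_exact(db, name, section):
    """Match only entries whose stored section EQUALS the requested one."""
    return [(n, s) for (n, s) in db if n == name and s == section]

def build(pages, lookup):
    """Process pages in order and return the toy database.

    Each page is (name, section, target); target is None for a real page,
    or the (name, section) that a .so link points to.
    """
    db = {}
    for name, section, target in pages:
        if target is None:
            if lookup(db, name, section):
                continue  # the lookup claims this page was already seen
            db[(name, section)] = "real"
        else:
            db[(name, section)] = "link"
            # Ghost entry for the target; an existing entry takes precedence.
            db.setdefault(target, "whatis")
    return db

def needs_rescan(db, on_disk):
    """Second run: a whatis entry that exists as a file triggers a rescan."""
    return any(kind == "whatis" and key in on_disk for key, kind in db.items())

# The simplified page tree from the comment above, in the problematic order.
PAGES = [
    ("acosh", "3", None),
    ("acosh", "3p", None),
    ("acoshl", "3p", ("acosh", "3p")),
]
ON_DISK = {(name, section) for name, section, _ in PAGES}

buggy = build(PAGES, lookup_buggy)
fixed = build(PAGES, lookup_exact)
```

In this model the prefix-matching lookup leaves acosh(3p) stored as a whatis entry, so `needs_rescan(buggy, ON_DISK)` is true on every run, while the exact lookup stores it as a real page and no rescan is triggered.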

As far as I can tell, my patch fixes this. Please let me know if there's still a problem of this nature after Andreas or one of the other Arch developers applies it. Thanks for your report and the debugging information, and sorry I took a while to deal with it!
Comment by Thomas Bächler (brain0) - Sunday, 21 June 2009, 09:38 GMT
I just applied that patch and built this locally (had to remove the Changelog hunk though) - running mandb twice made the second run take only a second or so:

0 man subdirectories contained newer manual pages.
0 manual pages were added.
0 stray cats were added.
0 old database entries were purged.

EDIT: I took the liberty of putting an updated package into testing.
Comment by Andreas Radke (AndyRTR) - Sunday, 21 June 2009, 12:29 GMT
Looks solved on my system as well. Thanks, Colin.
Comment by Arthur (rrto) - Thursday, 10 September 2009, 08:33 GMT
  • Field changed: Percent Complete (100% → 0%)
Comment by Colin Watson (cjwatson) - Thursday, 10 September 2009, 08:59 GMT
Allan, can you give more details about why you reopened this?

As I said in a previous comment: for problems like this, please run mandb with --debug (and without --quiet) and attach the output.
Comment by Bernd Pol (bernarcher) - Thursday, 10 September 2009, 12:53 GMT
There appears to be a regression with the latest 2.5.6-1 update.
mandb rebuilds the full database every time it is called.
See this forum thread: http://bbs.archlinux.org/viewtopic.php?pid=616872
Comment by Bernd Pol (bernarcher) - Thursday, 10 September 2009, 13:06 GMT
Debug log attached. mandb was run immediately after a previous mandb run.
Comment by Colin Watson (cjwatson) - Thursday, 10 September 2009, 13:26 GMT
OK. Could you now also attach the following three files:

/var/cache/man/index.db
/usr/share/man/man5/modprobe.conf.5.gz
/usr/share/man/man5/modprobe.d.5.gz
Comment by Bernd Pol (bernarcher) - Thursday, 10 September 2009, 19:14 GMT
Attached.
Comment by Bernd Pol (bernarcher) - Thursday, 10 September 2009, 19:19 GMT
Sorry, had to resolve permissions first.
Comment by Bernd Pol (bernarcher) - Thursday, 10 September 2009, 19:22 GMT
Argh! File too big.
Comment by Colin Watson (cjwatson) - Friday, 11 September 2009, 08:14 GMT
Thanks. This is a new cause of the same symptom, not as far as I can see a regression from the previous fix.

The proper fix in man-db is a little involved, and I'll have to think about it, but at least I have a test case now. In the meantime, your modprobe.d.5.gz is buggy anyway and ought to be fixed. I suggest that it be corrected as follows:

-.so modprobe.conf.5
+.so man5/modprobe.conf.5

(.so includes should always be relative to the *base* of the manual hierarchy, e.g. /usr/share/man in this case.) Fixing this will make this particular cause go away, although of course it's possible that there are other similar pages ...
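
As a rough illustration of how other pages with the same problem might be found, here is a sketch that flags .so requests whose target has no directory component. It is an assumption-laden example, not a man-db tool: the function names are hypothetical, only plain and gzip-compressed pages are read, and a target is treated as suspect simply because it contains no "/".

```python
import gzip
import re
from pathlib import Path

SO_LINE = re.compile(r"^\.so\s+(\S+)")

def so_target(text):
    """Return the target of a leading .so request, or None if there is none."""
    for line in text.splitlines():
        # Skip blank lines and roff comments before the first real request.
        if not line.strip() or line.startswith(('.\\"', "'\\\"")):
            continue
        match = SO_LINE.match(line)
        return match.group(1) if match else None
    return None

def is_hierarchy_relative(target):
    """True for targets like 'man5/modprobe.conf.5', False for 'modprobe.conf.5'."""
    return "/" in target

def find_suspect_links(man_root):
    """Yield (path, target) for pages whose .so target lacks a section directory."""
    for path in sorted(Path(man_root).glob("man*/*")):
        if not path.is_file():
            continue
        opener = gzip.open if path.suffix == ".gz" else open
        try:
            with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
                text = handle.read(4096)  # a .so link sits at the top of the page
        except (OSError, EOFError):
            continue  # unreadable or corrupt file; ignore it in this sketch
        target = so_target(text)
        if target and not is_hierarchy_relative(target):
            yield path, target
```

Something like `for path, target in find_suspect_links("/usr/share/man"): print(path, target)` would then list candidate pages to check by hand.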
Comment by Evangelos Foutras (foutrelis) - Friday, 11 September 2009, 12:01 GMT
The correction to modprobe.d.5.gz suggested by Colin fixes the issue for me. :)
Comment by Bernd Pol (bernarcher) - Friday, 11 September 2009, 12:30 GMT
Works here as well now. Thanks, Colin!
Comment by Aaron Griffin (phrakture) - Tuesday, 22 September 2009, 18:54 GMT
Is there anything we can do globally to fix this?
Comment by Andreas Radke (AndyRTR) - Sunday, 04 October 2009, 17:12 GMT
I mailed the module-init-tools project maintainer about their broken man page, and I'm going to fix it for Arch Linux now.

As Colin is aware of this bug, I think we can close this one.
Comment by Andreas Radke (AndyRTR) - Sunday, 04 October 2009, 17:19 GMT
If we find more broken links in man pages, please open one bug per page and assign it to me and the package maintainer. Closing this one now.
