FS#59238 - [samba] 4.8.3-1 reports protocol errors with ldb-1.4.0-1

Attached to Project: Arch Linux
Opened by Mike Knowles (mikek) - Thursday, 05 July 2018, 12:46 GMT
Last edited by Tobias Powalowski (tpowa) - Thursday, 12 July 2018, 07:50 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Tobias Powalowski (tpowa)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

Description:
A freshly installed system with samba causes error messages (and fails to start the internal Kerberos server) following a "samba-tool domain provision ..." command.

Additional info:
* affected packages believed to be samba-4.8.3-1 / ldb-1.4.0-1

Steps to reproduce:
I freshly installed a brand new Arch Linux system using only the base (required packages) plus those packages required by samba (primarily smbclient, libwbclient and ldb).

Rebooted, then logged in as root and ran "samba-tool domain provision --use-rfc2307 --interactive".
I accepted defaults to all questions except realm (prefixed my uppercased dns domain name with "AD." and accepted "AD" as workgroup) and set and confirmed a password for it.
This completes with the well known warning about "Unable to determine the DomainSID, ..." (exit status 0 - fine).
I then copied /var/lib/samba/private/krb5.conf to /etc.

Now try "systemctl start samba.service" and loads of messages similar to the following start appearing in the system log:

Starting Samba AD Daemon...
samba version 4.8.3 started.
...
ldb: Failed to lock db: ../ldb_tdb/ldb_tdb.c:147: Reusing ldb opend by pid 754 in process 757
/ Protocol error for CN=Configuration,DC=ad,DC=mydomainname
...
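
To check whether a given boot shows the same symptom, the two message patterns quoted above can be counted in the unit's log. This is only a rough sketch: the function name count_ldb_errors is made up for illustration, and it matches nothing beyond the two strings shown in this report.

```shell
# Hypothetical helper: counts log lines that match either of the two
# error patterns quoted in this report. Reads log text from stdin.
count_ldb_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so the exit status is masked to keep callers simple.
    grep -c -E 'ldb: Failed to lock db|Protocol error for ' || true
}

# Example, on the affected machine:
#   journalctl -u samba.service -b --no-pager | count_ldb_errors
```

A non-zero count after starting samba.service would suggest the machine is hitting the problem described here.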

I believe it is likely to be a mismatch of versions between samba-4.8.3 and either ldb-1.3.4 or ldb-1.4.0. pacman -Syu currently installs samba-4.8.3-1 with ldb-1.4.0-1 but I *think* the ldb that samba's after may be 1.3.4, which is not available to me (not via pacman anyway).

When samba is started it appears the Kerberos server is not running (no ports 88 & 464 listed in netstat) so "kinit administrator" fails.
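
The port check can be scripted roughly as follows. This is a sketch only: missing_kdc_ports is a made-up name, it uses ss rather than netstat, and it assumes the local address:port is the fifth column, as "ss -tuln" prints it on my understanding of its output.

```shell
# Hypothetical helper: reads `ss -tuln` style output on stdin and prints
# each Kerberos port (88 = KDC, 464 = kpasswd) that has no listener.
# Assumes the local address:port is the fifth whitespace-separated field.
missing_kdc_ports() {
    awk '
        $5 ~ /:88$/  { kdc = 1 }
        $5 ~ /:464$/ { kpasswd = 1 }
        END {
            if (!kdc)     print 88
            if (!kpasswd) print 464
        }
    '
}

# Example:
#   ss -tuln | missing_kdc_ports
```

Empty output means both ports have a listener. With the broken package combination described here, both 88 and 464 would be expected to show up as missing.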

I have downloaded the full samba 4.8.3 source tarball, built and run it on a test system and THAT appears to be okay. I think it's merely a build issue with the current versions, but it is preventing me doing anything with active directory stuff at present. I think this problem arrived with samba-4.8.3-1 but, since I never installed the 4.8.2-[12] versions, I cannot be sure.

Mike

Closed by  Tobias Powalowski (tpowa)
Thursday, 12 July 2018, 07:50 GMT
Reason for closing:  Fixed
Additional comments about closing:  ldb-1:1.3.4-2
samba-4.8.3-2
Comment by Mike Knowles (mikek) - Thursday, 05 July 2018, 16:33 GMT
Whilst reading the wiki pages about the Arch Build System (svn/git checkout, etc) I looked at the downloads area on the samba.org site.
Going by the date/timestamps on ldb-1.3.4.tar.gz & ldb-1.4.0.tar.gz, the LATEST updated one is 1.3.4 NOT 1.4.0 - maybe significant ?

pacman -Syu installs 1.4.0 at present - maybe it should use 1.3.4 instead ?

Mike
Comment by loqs (loqs) - Thursday, 05 July 2018, 16:56 GMT
You can find the older packages to test on https://wiki.archlinux.org/index.php/Arch_Linux_Archive
If you extract samba-4.8.3-1-x86_64.pkg.tar.xz and inspect its .BUILDINFO, you will see that ldb-1.4.0-1-x86_64 was used; it was also used for samba-4.8.2-2-x86_64.pkg.tar.xz.
If you locally build with the PKGBUILD does that work? If you change the PKGBUILD to remove !ldb from the --bundled-libraries option to configure does that work?
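
The .BUILDINFO check mentioned above can be done without unpacking the whole package. A minimal sketch, assuming the file uses the usual "installed = name-version-release-arch" lines; buildinfo_ldb is a made-up helper name.

```shell
# Hypothetical helper: reads a .BUILDINFO file on stdin and prints the
# ldb package that was installed in the build environment.
# Assumes the usual "installed = <name>-<ver>-<rel>-<arch>" line format.
buildinfo_ldb() {
    sed -n 's/^installed = \(ldb-[0-9].*\)$/\1/p'
}

# Example, with the package file in the current directory:
#   bsdtar -xOf samba-4.8.3-1-x86_64.pkg.tar.xz .BUILDINFO | buildinfo_ldb
```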
Comment by Brenden (drzaeus77) - Thursday, 05 July 2018, 20:39 GMT
I have the same problem, and was able to compile my own package that seems to work. To do so, I had to first compile ldb 1.3.4 from source, as the online archive skips from 1.3.3 to 1.4.0. After doing that, a standard makepkg of samba 4.8.3 (without changing --bundled-libraries) compiled against ldb 1.3.4 runs without errors.
Comment by Mike (quailman38) - Thursday, 05 July 2018, 20:40 GMT
I tried downgrading to 4.8.2-2 but ran into dependency issues. I can confirm this is happening, would really like this resolved ASAP!
Comment by Mike Knowles (mikek) - Friday, 06 July 2018, 10:11 GMT
Right, I reckon I know how to fix this having just rebuilt samba and ldb packages to prove the point. I did the following:

mkdir /tmp/build; cd /tmp/build # I used /tmp as it's actually a 16GB filesystem on 2.8GHz DDR4, supported by a decent UPS, so why wouldn't you ? :-)

asp checkout ldb samba

cd ldb/repos/extra-x86_64/

vi PKGBUILD
# change line "pkgver=1.4.0" to "pkgver=1.3.4"
# change line "md5sums=('9c2fb7f03cb6695e5b78a99af86d9bf4'" to "md5sums=('0279ff75049c26e839b66f9e9daf7120'"
# ZZ

makepkg -si

cd ../../../samba/repos/extra-x86_64/

# in my case, I added " -j4" to the end of line 83 but otherwise left PKGBUILD completely unaltered

makepkg -si

# this installs a newly built ldb-1.3.4-1 and a REBUILT (i.e. to use ldb 1.3.4 NOT 1.4.0) samba-4.8.3-1 (no doubt Arch will build and release ldb-1.3.4-1 & samba-4.8.3-2 to achieve this).
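
For anyone who would rather not edit by hand, the two PKGBUILD changes above can be scripted roughly like this. It is a sketch, not a tested recipe: pin_ldb_134 is a made-up name, "sed -i" assumes GNU sed, and the version and checksum strings are simply the ones quoted in the steps above.

```shell
# Hypothetical helper: applies the two PKGBUILD edits described above
# (pkgver and the first md5sum) to the file given as $1.
# The strings are copied from this comment; verify the checksum against
# the tarball you actually download.
pin_ldb_134() {
    sed -i \
        -e 's/^pkgver=1\.4\.0$/pkgver=1.3.4/' \
        -e 's/9c2fb7f03cb6695e5b78a99af86d9bf4/0279ff75049c26e839b66f9e9daf7120/' \
        "$1"
}

# Example, run from ldb/repos/extra-x86_64/:
#   pin_ldb_134 PKGBUILD && makepkg -si
```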


I now installed these four packages (samba/smbclient/libwbclient/ldb and their dependencies) on a FRESHLY BUILT VirtualBox'd system (just the base packages + these),
rebooted, logged in as root, set LDB_MODULES_PATH=/usr/lib/samba/ldb and ran "samba-tool domain provision --use-rfc2307 --interactive" - that worked fine as usual.

Now copied /var/lib/samba/private/krb5.conf to /etc, enabled and started samba.service - and THAT runs just fine too. You get the usual messages about dnsupdate and, in my case,
warnings about failing to bind to ipv6, etc, but there's no guff about ldb failing to lock db and protocol errors, etc. kinit administrator does what it should do, smbclient -L localhost
returns sensible info, fine.

Now, can we assign someone to get this built and re-issued so we can tick this one off the todo list, yes ?

Before I finish, I did note a couple of things in passing:

- it appears the 1.4.0 build of ldb results in four files which weren't in my 1.3.4 version for some reason (they are PROBABLY legitimate differences between the two versions but I can't be sure):
/usr/lib/ldb/libldb-key-value.so
/usr/lib/ldb/libldb-mdb-int.so
/usr/lib/ldb/modules/ldb/ldb.so
/usr/lib/ldb/modules/ldb/mdb.so

- the old smbclient package had a file /usr/lib/samba/libcmocka-samba4.so (which didn't get built in mine - not sure at all about this one ?)

- finally, I had to bung the jansson package on my VirtualBox test system because the samba build evidently configure'd and built having spotted that package on my build system
(rather like the comment in the samba PKGBUILD file about uninstalling dmapi first, I suspect this may need excluding too to avoid catching others out)
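
File-list differences like the ones noted above can be found by comparing the two packages' contents. A minimal sketch: only_in_first is a made-up helper, and the .files names in the example are placeholders for whatever lists you produce.

```shell
# Hypothetical helper: prints paths listed in file $1 but not in file $2.
# Each input is a plain file list, one path per line, in any order.
only_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    # comm -23 keeps only the lines unique to the first sorted input.
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Example, with both builds' package files at hand:
#   bsdtar -tf ldb-1.4.0-1-x86_64.pkg.tar.xz > ldb-1.4.0.files
#   bsdtar -tf ldb-1.3.4-1-x86_64.pkg.tar.xz > ldb-1.3.4.files
#   only_in_first ldb-1.4.0.files ldb-1.3.4.files
```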

Other than that - it's a straightforward fix (AS LONG AS NOTHING ELSE REQUIRES ldb-1.4.0-1 !)
Obviously, I don't have authority to make such changes to the repos/archive/whatever to properly fix this, but I think sufficient information is provided above to get it done, yes ?

Just to double check, I repeated the above but with an UNALTERED ldb (i.e. built 1.4.0 again) to see what happens. Reinstalled the VirtualBox system from scratch (reformat disk, install base, etc)
with that and, guess what, it fails: kinit can't find KDC for realm, smbclient -L localhost bitches about NT_STATUS_INVALID_SID, etc.

So, can we please stop using ldb 1.4.0 until whatever's wrong between it and samba is put right again ?

Mike
Comment by Mike Knowles (mikek) - Friday, 06 July 2018, 11:41 GMT
Brenden (drzaeus77) & Mike (quailman38),

Thanks for your observations. At least it confirms I'm not the only one seeing this problem, though earlier in the week (having searched Google, bug tracking sites, forums, etc) it began to LOOK like that :-)

Mike
Comment by Mike (quailman38) - Friday, 06 July 2018, 13:44 GMT
I was able to manually build the package using the right LDB version above, and it does indeed allow the samba services to start. However, a previously working share is no longer working with this setup (netlogon/sysvol work fine though). Is anyone else having problems accessing shares from Windows 10 machines on this version?
Comment by Mike Knowles (mikek) - Saturday, 07 July 2018, 11:58 GMT
Mike (quailman38):

If I understand you correctly ("the right LDB version" == 1.4.0 ?), you got samba-4.8.3-1 built with ldb-1.4.0-1 and STARTED it successfully, yes ?

I could build and start that alright, and superficially it appeared okay when looking at a subsequent process listing, but netstat revealed it had failed to start Kerberos (no ports 88 and 464, udp or tcp).
As a result, it failed even the most basic tests in https://wiki.samba.org/index.php/Setting_up_Samba_as_an_Active_Directory_Domain_Controller#Testing_your_Samba_AD_DC. This is with a pretty much default "samba-tool domain provision" command and starting samba.service.

Did you manage to get THAT (samba-4.8.3 built with ldb-1.4.0) up and running with a working (internal to samba) Kerberos ? If so, do tell me how because the straight "asp checkout ldb samba" / "makepkg -si" commands fail to do so for me. I tried the suggested "remove this", "disable that" and nothing I tried got it to work with ldb-1.4.0-1.

As for Windows 10, no, sorry - my current problems with samba are purely concerned with the fact that (with ldb-1.4.0-1) it won't even talk to ITSELF - part of the reason I raised this as high severity. If a package fails even basic internal/installation tests on itself, it stands no chance at all of operating reliably in a live environment. The fact that the system log gets filled with loads of ldb protocol errors suggests to me either that 4.8.3 and 1.4.0 are versions that are not supposed to be run together, or something fundamental in the protocol changed between 1.3.x and 1.4.0 and samba 4.8.3 wasn't ready for that. Whatever the story, it appears just using 1.3.4 makes those problems disappear.

I suspect other Linux distributions MAY not have hit this (YET) because of their more conventional release schedules (i.e. NOT rolling release), which may well explain why I searched and found nobody else talking about it. I have no idea how large or small the Arch Linux user base might be and, within that, how many are using it in a Windows AD configuration (AND had just updated their packages in the last few weeks). It may very well be that, given samba can be a complicated beast to get working EXACTLY as you want (for an existing Windows community with various potentially quirky requirements to satisfy), most of the existing installations out there are governed by a policy of "right, it works fine now - don't dick around with it, and that means NO changes unless they're fixes for things we REALLY can't live with !" I just had a quick look at Mageia 6's existing core release/update contents (a distribution I used to use until a few months ago) - it is still using samba-4.6.12 and ldb-1.1.29.

Repository Maintainers:

As I identified at length in my findings yesterday (Friday, 06 July 2018, 10:11 GMT), samba-4.8.3 built with ldb-1.3.4 and installed on a clean machine appears to work just fine, whilst the existing samba-4.8.3-1 & ldb-1.4.0-1 do not. No-one has yet shown me that samba 4.8.3 works with ldb 1.4.0, which begs the following question:

Did anyone check the system logs after it was built/installed/run for the first time ? The reason I ask is that this combination fails with even the most basic test configuration detailed on the samba wiki site, which is clearly designed as an initial "establish confidence nothing's gone spectacularly wrong" test. There is absolutely NOTHING complicated about the environment in which I performed MY testing on this. As I say, it was done on an absolutely CLEAN, newly installed system consisting of base packages + samba and associated dependencies - nothing else !

Mike
Comment by loqs (loqs) - Saturday, 07 July 2018, 12:14 GMT
@mikek it has not been assigned yet so the package maintainer may not be aware of the issue yet.
While you are waiting for a response you could try contacting upstream to ask whether samba 4.8.3 with ldb 1.4.0 is supported / expected to work, and whether upstream believes it is a packaging or an upstream issue.
https://www.samba.org/samba/irc.html / https://www.samba.org/samba/archives.html
Comment by Mike Knowles (mikek) - Saturday, 07 July 2018, 13:06 GMT
@loqs - "not been assigned yet": yep, fair comment - I've seen the list of open bugs (some 127 high or critical !), a few of which have been around for months. I understand they're likely to be quite busy.
I will get in touch with the people looking after samba to ask the obvious question though.

Mike
Comment by Zep Man (Zepman) - Thursday, 12 July 2018, 07:04 GMT
More background information.

Initial bug symptoms:
https://www.spinics.net/lists/samba/msg150771.html

Bug recognized by Samba:
https://bugzilla.samba.org/show_bug.cgi?id=13519
https://www.spinics.net/lists/samba/msg150983.html

From what I read, building Samba 4.8.3 against ldb >= 1.4.0 should have been blocked, but it was not, because the check that would have blocked it was not included.
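
A packager-side guard along those lines could look roughly like this. It is purely illustrative: the function name is made up, the 1.4.0 cut-off is taken from this report rather than from Samba's build scripts, and "sort -V" assumes GNU coreutils.

```shell
# Hypothetical guard: succeeds only if ldb version $1 is below 1.4.0.
# The cut-off comes from this bug report, not from upstream's configure
# checks. Version ordering is delegated to `sort -V`.
ldb_ok_for_samba_4_8() {
    [ "$1" != "1.4.0" ] &&
        [ "$(printf '%s\n' "$1" 1.4.0 | sort -V | head -n 1)" = "$1" ]
}

# Example: strip epoch and pkgrel from the installed version first.
#   v=$(pacman -Q ldb | awk '{print $2}' | sed 's/^[0-9]*://; s/-[^-]*$//')
#   ldb_ok_for_samba_4_8 "$v" || echo "ldb $v is too new for samba 4.8.x"
```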

ldb 1.4.0 was added on 2018-06-22 to Arch Linux. Downgrading to an Arch Linux snapshot of 2018-06-21 temporarily solved this issue for me, based on these instructions:
https://wiki.archlinux.org/index.php/Arch_Linux_Archive#How_to_restore_all_packages_to_a_specific_date
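
For reference, the snapshot downgrade boils down to pointing pacman at a dated archive URL. A minimal sketch of that step, assuming the archive URL layout described on the wiki page linked above; ala_server_line is a made-up helper name.

```shell
# Hypothetical helper: prints a mirrorlist Server line pointing at the
# Arch Linux Archive snapshot for a date given as YYYY-MM-DD.
# $repo and $arch are left literal on purpose; pacman expands them.
ala_server_line() {
    printf 'Server=https://archive.archlinux.org/repos/%s/$repo/os/$arch\n' \
        "$(printf '%s' "$1" | sed 's|-|/|g')"
}

# Example (as root), keeping a backup of the current mirrorlist:
#   cp /etc/pacman.d/mirrorlist /etc/pacman.d/mirrorlist.bak
#   ala_server_line 2018-06-21 > /etc/pacman.d/mirrorlist
#   pacman -Syyuu
```

Restoring the backed-up mirrorlist and running a normal update undoes the downgrade once fixed packages are available.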
