FS#65048 - [ceph-mgr] Dependency: python-yaml is missing

Attached to Project: Community Packages
Opened by likeit (likeit) - Saturday, 04 January 2020, 19:17 GMT
Last edited by Thore Bödecker (foxxx0) - Wednesday, 15 January 2020, 07:38 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To Thore Bödecker (foxxx0)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

The python-yaml dependency is missing!

Steps to reproduce:

execute: systemctl start ceph-mgr@*.service

Jan 04 20:13:14 node2 ceph-mgr[514]: File "/usr/share/ceph/mgr/k8sevents/module.py", line 28, in <module>
Jan 04 20:13:14 node2 ceph-mgr[514]: import yaml
Jan 04 20:13:14 node2 ceph-mgr[514]: ModuleNotFoundError: No module named 'yaml'
Jan 04 20:13:14 node2 ceph-mgr[514]: 2020-01-04 20:13:14.993 7f44b14abd40 -1 mgr[py] Class not found in module 'k8sevents'
Jan 04 20:13:14 node2 ceph-mgr[514]: 2020-01-04 20:13:14.993 7f44b14abd40 -1 mgr[py] Error loading module 'k8sevents': (2) No such file or directory
Jan 04 20:13:18 node2 ceph-mgr[514]: 2020-01-04 20:13:18.343 7f44b14abd40 -1 log_channel(cluster) log [ERR] : Failed to load ceph-mgr modules: k8sevents
Jan 04 20:13:19 node2 ceph-mgr[514]: [04/Jan/2020:20:13:19] ENGINE Bus STARTING
Jan 04 20:13:19 node2 ceph-mgr[514]: [04/Jan/2020:20:13:19] ENGINE Serving on https://:::8443
Jan 04 20:13:19 node2 ceph-mgr[514]: [04/Jan/2020:20:13:19] ENGINE Bus STARTED
Jan 04 20:13:41 node2 ceph-mgr[514]: 2020-01-04 20:13:41.756 7f448b1cd700 -1 client.0 error registering admin socket command: (17) File exists
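As an aside (not part of the original report): a missing-module failure like this can be confirmed without starting the daemon, by probing the interpreter ceph-mgr runs under. `module_available` below is an illustrative helper, not a ceph tool:

```python
import importlib.util

def module_available(name: str) -> bool:
    """Return True if `name` can be imported by this interpreter."""
    return importlib.util.find_spec(name) is not None

# With python-yaml absent, module_available("yaml") is False, matching
# the ModuleNotFoundError in the log above.
print(module_available("sys"))  # stdlib module: True
```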
This task depends upon

Closed by  Thore Bödecker (foxxx0)
Wednesday, 15 January 2020, 07:38 GMT
Reason for closing:  Fixed
Additional comments about closing:  fixed as of ceph-mgr-14.2.6-1
Comment by Thore Bödecker (foxxx0) - Saturday, 04 January 2020, 23:19 GMT
There are now ceph packages with version 14.2.5-2 in [community-testing].
The ceph-mgr package with that version contains python-yaml as a dep.

Please test it and report back if that solves the outstanding issues with ceph-mgr.
Comment by likeit (likeit) - Sunday, 05 January 2020, 08:20 GMT
The python-yaml dependency fixes the mgr startup, but the dashboard still does not work:

Jan 05 09:09:44 node2 ceph-mgr[9912]: ::ffff:192.168.2.127 - - [05/Jan/2020:09:09:44] "GET / HTTP/1.1" 404 604 "" "Mozilla/5.0 (X11; Linux x86_64; rv:71.0) Gecko/20100101 Firefox/71.0"
Jan 05 09:09:48 node2 ceph-mgr[9912]: Exception in thread Thread-1:
Jan 05 09:09:48 node2 ceph-mgr[9912]: Traceback (most recent call last):
Jan 05 09:09:48 node2 ceph-mgr[9912]: File "/lib/python3.8/threading.py", line 932, in _bootstrap_inner
Jan 05 09:09:48 node2 ceph-mgr[9912]: self.run()
Jan 05 09:09:48 node2 ceph-mgr[9912]: File "/usr/share/ceph/mgr/volumes/fs/volume.py", line 100, in run
Jan 05 09:09:48 node2 ceph-mgr[9912]: self.function(*self.args, **self.kwargs)
Jan 05 09:09:48 node2 ceph-mgr[9912]: File "/usr/share/ceph/mgr/volumes/fs/volume.py", line 118, in cleanup_connections
Jan 05 09:09:48 node2 ceph-mgr[9912]: idle_fs = [fs_name for fs_name,conn in self.connections.iteritems()
Jan 05 09:09:48 node2 ceph-mgr[9912]: AttributeError: 'dict' object has no attribute 'iteritems'

If you navigate to the dashboard, the following traceback occurs:

Traceback (most recent call last):
File "/lib/python3.8/site-packages/cherrypy/lib/static.py", line 58, in serve_file
st = os.stat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/usr/share/ceph/mgr/dashboard/frontend/dist/en-US/index.html'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/lib/python3.8/site-packages/cherrypy/_cprequest.py", line 638, in respond
self._do_respond(path_info)
File "/lib/python3.8/site-packages/cherrypy/_cprequest.py", line 697, in _do_respond
response.body = self.handler()
File "/lib/python3.8/site-packages/cherrypy/lib/encoding.py", line 219, in __call__
self.body = self.oldhandler(*args, **kwargs)
File "/lib/python3.8/site-packages/cherrypy/_cptools.py", line 230, in wrap
return self.newhandler(innerfunc, *args, **kwargs)
File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 88, in dashboard_exception_handler
return handler(*args, **kwargs)
File "/lib/python3.8/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
return self.callable(*self.args, **self.kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/home.py", line 108, in __call__
return serve_file(full_path)
File "/lib/python3.8/site-packages/cherrypy/lib/static.py", line 65, in serve_file
raise cherrypy.NotFound()
cherrypy._cperror.NotFound: (404, "The path '/' was not found.")


As you can see below, there is no folder 'en-US'.

admin@node2 ..share/ceph/mgr/dashboard/frontend/dist % ls -l
total 8292
-rw-r--r-- 1 root root 232374 Jan 4 22:45 2.fc3177df96823cfd76d3.js
-rw-r--r-- 1 root root 83893 Jan 4 22:45 3rdpartylicenses.txt
-rw-r--r-- 1 root root 211784 Jan 4 22:45 6.2f6d8d0e6ecbbf8a2c37.js
-rw-r--r-- 1 root root 158902 Jan 4 22:45 7.939f0ac720f007a3044d.js
-rw-r--r-- 1 root root 145098 Jan 4 22:45 8.f953195a5ae6e7408603.js
-rw-r--r-- 1 root root 46013 Jan 4 22:45 9.c5cb7854634f3cb0f9b4.js
-rw-r--r-- 1 root root 12949 Jan 4 22:45 Ceph_Logo_Stacked_RGB_120411_fa_228x228.1ed169ccc35367a2dab2.png
drwxr-xr-x 2 root root 4096 Jan 5 09:08 assets
-rw-r--r-- 1 root root 2508 Jan 4 22:45 common.8a53d98b04768bd15706.js
-rw-r--r-- 1 root root 1150 Jan 4 22:45 favicon.ico
-rw-r--r-- 1 root root 188756 Jan 4 22:45 forkawesome-webfont.fc46f3dae03b2b2e1cee.ttf
-rw-r--r-- 1 root root 91624 Jan 4 22:45 forkawesome-webfont.3a9e014c2469ffa65a0e.woff2
-rw-r--r-- 1 root root 188946 Jan 4 22:45 forkawesome-webfont.35e77a38ca9d85c4e897.eot
-rw-r--r-- 1 root root 115148 Jan 4 22:45 forkawesome-webfont.44bbdbbfb5a10ba2d1ce.woff
-rw-r--r-- 1 root root 480784 Jan 4 22:45 forkawesome-webfont.78dcc9c4999659b8026a.svg
-rw-r--r-- 1 root root 45404 Jan 4 22:45 glyphicons-halflings-regular.e18bbf611f2a2e43afc0.ttf
-rw-r--r-- 1 root root 20127 Jan 4 22:45 glyphicons-halflings-regular.f4769f9bdb7466be6508.eot
-rw-r--r-- 1 root root 23424 Jan 4 22:45 glyphicons-halflings-regular.fa2772327f55d8198301.woff
-rw-r--r-- 1 root root 18028 Jan 4 22:45 glyphicons-halflings-regular.448c34a56d699c29117a.woff2
-rw-r--r-- 1 root root 108738 Jan 4 22:45 glyphicons-halflings-regular.89889688147bd7575d63.svg
-rw-r--r-- 1 root root 1162 Jan 4 22:45 index.html
-rw-r--r-- 1 root root 5748734 Jan 4 22:45 main.b69c6a0ae3d8f949ed36.js
-rw-r--r-- 1 root root 101651 Jan 4 22:45 polyfills.f31db31652a3fd9f4bca.js
-rw-r--r-- 1 root root 2777 Jan 4 22:45 prometheus_logo.074db273ef932a67d91b.svg
-rw-r--r-- 1 root root 2331 Jan 4 22:45 runtime.b5a5b72201aafa6024c9.js
-rw-r--r-- 1 root root 210110 Jan 4 22:45 scripts.fc88ef4a23399c760d0b.js
-rw-r--r-- 1 root root 186408 Jan 4 22:45 styles.f5317b15474518dffebc.css
Comment by Thore Bödecker (foxxx0) - Friday, 10 January 2020, 12:50 GMT
Thanks for the quick and detailed feedback. I have identified an issue in the upstream build tooling and found a workaround with some of the mgr dashboard devs.
I've just kicked off another build for version 14.2.6 that will hopefully go through and land in [community-testing] again.
Once it has completed I'll post another comment here.
Comment by Thore Bödecker (foxxx0) - Friday, 10 January 2020, 17:25 GMT
ceph{,-libs,-mgr}-14.2.6-1 are now in [community-testing].
This should fix the remaining mgr-dashboard issues; the corresponding run-tox-mgr-dashboard test passed successfully.

Please give them a test and report back.
Comment by likeit (likeit) - Saturday, 11 January 2020, 09:55 GMT
The dashboard seems to work now, but there is still a problem left (an upstream bug?):

# systemctl status ceph-mgr@m2.service
--------------------------------------
Jan 11 10:44:25 node2 ceph-mgr[512]: 2020-01-11 10:44:25.478 7fcba3ddd700 -1 client.0 error registering admin socket command: (17) File exists
Jan 11 10:45:36 node2 ceph-mgr[512]: Exception in thread Thread-1:
Jan 11 10:45:36 node2 ceph-mgr[512]: Traceback (most recent call last):
Jan 11 10:45:36 node2 ceph-mgr[512]: File "/lib/python3.8/threading.py", line 932, in _bootstrap_inner
Jan 11 10:45:36 node2 ceph-mgr[512]: self.run()
Jan 11 10:45:36 node2 ceph-mgr[512]: File "/usr/share/ceph/mgr/volumes/fs/volume.py", line 100, in run
Jan 11 10:45:36 node2 ceph-mgr[512]: self.function(*self.args, **self.kwargs)
Jan 11 10:45:36 node2 ceph-mgr[512]: File "/usr/share/ceph/mgr/volumes/fs/volume.py", line 118, in cleanup_connections
Jan 11 10:45:36 node2 ceph-mgr[512]: idle_fs = [fs_name for fs_name,conn in self.connections.iteritems()
Jan 11 10:45:36 node2 ceph-mgr[512]: AttributeError: 'dict' object has no attribute 'iteritems'
Comment by Thore Bödecker (foxxx0) - Monday, 13 January 2020, 16:48 GMT
I've been trying to replicate your error in my test cluster but haven't managed to reproduce it yet.

Could you post the "enabled_modules" from the `ceph mgr module ls` output, and possibly a `ceph status` output too?

Otherwise, the ceph 14.2.6-1 packages in [community-testing] have undergone some further testing by myself, and I haven't been able to find any issues with them.
The other locales for the mgr dashboard seem to work flawlessly too.
Comment by likeit (likeit) - Monday, 13 January 2020, 17:10 GMT
Sure

# ceph mgr module ls
--------------------
{
"enabled_modules": [
"dashboard",
"iostat",
"pg_autoscaler",
"restful",
"zabbix"
],
...
}

# ceph status
--------------------
cluster:
id: ...
health: HEALTH_ERR
1 nearfull osd(s)
4 pool(s) nearfull
1 pools have many more objects per pg than average
Failed to send data to Zabbix
2 scrub errors
Low space hindering backfill (add storage if this doesn't resolve itself): 12 pgs backfill_toofull
Possible data damage: 2 pgs inconsistent
Degraded data redundancy: 100391/70448353 objects degraded (0.143%), 6 pgs degraded, 6 pgs undersized
86 pgs not deep-scrubbed in time
105 pgs not scrubbed in time
mon node1 is low on available space

services:
mon: 2 daemons, quorum node1,node2 (age 9h)
mgr: m2(active, since 2d), standbys: m1
mds: cephfs_secure:1 {0=b=up:active} 1 up:standby
osd: 17 osds: 17 up (since 2d), 17 in (since 5d); 12 remapped pgs

data:
pools: 4 pools, 966 pgs
objects: 14.54M objects, 26 TiB
usage: 41 TiB used, 15 TiB / 57 TiB avail
pgs: 100391/70448353 objects degraded (0.143%)
1064610/70448353 objects misplaced (1.511%)
951 active+clean
6 active+undersized+degraded+remapped+backfill_toofull
6 active+remapped+backfill_toofull
2 active+clean+inconsistent
1 active+clean+scrubbing+deep
Comment by Thore Bödecker (foxxx0) - Wednesday, 15 January 2020, 07:34 GMT
Strange. I would advise you to get in contact with upstream yourself, either in the #ceph channel on OFTC or via the ceph-users mailing list.
My test cluster has now reached a HEALTH_OK state again after being (purposely) in a very broken state.
Thus I assume the 14.2.6-1 packages are working as they should and will be moved out of [testing] soon.

I'm closing this issue as resolved, since the original errors have been fixed.