-
Notifications
You must be signed in to change notification settings - Fork 542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[microceph][ubuntu] Add changes for microceph collections #3291
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,62 +31,77 @@ class CephOSD(Plugin, RedHatPlugin, UbuntuPlugin): | |
plugin_name = 'ceph_osd' | ||
profiles = ('storage', 'virt', 'container') | ||
containers = ('ceph-(.*-)?osd.*',) | ||
files = ('/var/lib/ceph/osd/', '/var/lib/ceph/*/osd*') | ||
files = ('/var/lib/ceph/osd/', '/var/lib/ceph/*/osd*', | ||
'/var/snap/microceph/common/data/osd/*') | ||
|
||
def setup(self): | ||
|
||
self.add_file_tags({ | ||
"/var/log/ceph/(.*/)?ceph-(.*-)?osd.*.log": 'ceph_osd_log', | ||
}) | ||
|
||
self.add_forbidden_path([ | ||
"/etc/ceph/*keyring*", | ||
"/var/lib/ceph/**/*keyring*", | ||
# Excludes temporary ceph-osd mount location like | ||
# /var/lib/ceph/tmp/mnt.XXXX from sos collection. | ||
"/var/lib/ceph/**/tmp/*mnt*", | ||
"/etc/ceph/*bindpass*" | ||
]) | ||
|
||
# Only collect OSD specific files | ||
self.add_copy_spec([ | ||
"/run/ceph/**/ceph-osd*", | ||
"/var/lib/ceph/**/kv_backend", | ||
"/var/log/ceph/**/ceph-osd*.log", | ||
"/var/log/ceph/**/ceph-volume*.log", | ||
]) | ||
|
||
self.add_cmd_output([ | ||
"ceph-disk list", | ||
"ceph-volume lvm list" | ||
]) | ||
|
||
cmds = [ | ||
"bluestore bluefs available", | ||
"config diff", | ||
"config show", | ||
"dump_blacklist", | ||
"dump_blocked_ops", | ||
"dump_historic_ops_by_duration", | ||
"dump_historic_slow_ops", | ||
"dump_mempools", | ||
"dump_ops_in_flight", | ||
"dump_op_pq_state", | ||
"dump_osd_network", | ||
"dump_reservations", | ||
"dump_watchers", | ||
"log dump", | ||
"perf dump", | ||
"perf histogram dump", | ||
"objecter_requests", | ||
"ops", | ||
"status", | ||
"version", | ||
] | ||
|
||
self.add_cmd_output( | ||
[f"ceph daemon {i} {c}" for i in self.get_socks() for c in cmds] | ||
) | ||
microceph_pkg = self.policy.package_manager.pkg_by_name('microceph') | ||
if not microceph_pkg: | ||
self.add_file_tags({ | ||
"/var/log/ceph/(.*/)?ceph-(.*-)?osd.*.log": 'ceph_osd_log', | ||
}) | ||
|
||
self.add_forbidden_path([ | ||
"/etc/ceph/*keyring*", | ||
"/var/lib/ceph/**/*keyring*", | ||
# Excludes temporary ceph-osd mount location like | ||
# /var/lib/ceph/tmp/mnt.XXXX from sos collection. | ||
"/var/lib/ceph/**/tmp/*mnt*", | ||
"/etc/ceph/*bindpass*" | ||
]) | ||
|
||
# Only collect OSD specific files | ||
self.add_copy_spec([ | ||
"/run/ceph/**/ceph-osd*", | ||
"/var/lib/ceph/**/kv_backend", | ||
"/var/log/ceph/**/ceph-osd*.log", | ||
"/var/log/ceph/**/ceph-volume*.log", | ||
]) | ||
|
||
self.add_cmd_output([ | ||
"ceph-disk list", | ||
"ceph-volume lvm list" | ||
]) | ||
|
||
cmds = [ | ||
"bluestore bluefs available", | ||
"config diff", | ||
"config show", | ||
"dump_blacklist", | ||
"dump_blocked_ops", | ||
"dump_historic_ops_by_duration", | ||
"dump_historic_slow_ops", | ||
"dump_mempools", | ||
"dump_ops_in_flight", | ||
"dump_op_pq_state", | ||
"dump_osd_network", | ||
"dump_reservations", | ||
"dump_watchers", | ||
"log dump", | ||
"perf dump", | ||
"perf histogram dump", | ||
"objecter_requests", | ||
"ops", | ||
"status", | ||
"version", | ||
] | ||
|
||
self.add_cmd_output( | ||
[f"ceph daemon {i} {c}" for i in self.get_socks() for c in cmds] | ||
) | ||
|
||
else: | ||
# Only collect microceph files, don't run any commands | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why don't we want to run any commands? I would expect to still want all the OSD commands regardless of whether we are using microceph or not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in microceph, I can run the ceph commands (on the mon nodes) but not the ceph daemon commands on the OSD nodes that the OSD nodes run for ceph_osd collections.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
actually I think this might be a bug in microceph so I've raised canonical/microceph#160 , if its supposed to work then I'll change the code in ceph_osd to also run the ceph daemon commands. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll submit a new PR later to add the ceph daemon command collections once canonical/microceph#160 gets resolved. |
||
self.add_forbidden_path([ | ||
"/var/snap/microceph/common/**/*keyring*", | ||
"/var/snap/microceph/current/**/*keyring*", | ||
"/var/snap/microceph/common/state/*", | ||
]) | ||
|
||
self.add_copy_spec([ | ||
"/var/snap/microceph/common/data/osd/*", | ||
"/var/snap/microceph/common/logs/*ceph-osd*.log", | ||
]) | ||
|
||
def get_socks(self): | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of duplicating all of the copy specs, could we calculate a path prefix instead?
e.g.
something like this used to be done for containers although I think it was since mostly replaced with the logic for looking for paths inside containers instead. so most of the code that did similar seems gone now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was avoiding touching any of the existing ceph code, so apart from an else i tried to not change it. I can do it this way, sure..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The files we collect aren't the same, for eg the calamari ones are not present in the microceph env, so for eg, in the ceph_common now, (am using the snap detection method here)
The forbidden paths are also unnecessary for microceph since we don't collect anything but those two logs. So since the microceph log collections are smaller than ceph, and that holds true for the other ceph plugins too, I've separated them out, hence the duplicate add_copy_spec() calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @lathiat, if we have common file paths, then it would be easier to change one line, rather than 2 lines foe anything we are collecting. It just means less error-prone and less items to change if there are nay changes required in the future.
imo, if a file doesn't exist, then there isn't a real problem, it will be skipped by sos anyway. So, having a common set should reduce the amount of items we add
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at ceph_common as an example, I've tried making the change recommended in the review to have a common PATH variable and use only one code path for the file collections, but I still feel the code is eventually cleaner and easier to follow with the separation between the two plugins with one big "if".
The microceph stuff in ceph_common plugin as of now only collects two log files as opposed to several more in the ceph path,
The ceph_common plugin as of now collects more stuff, so for eg consider, the only thing in common is marked in <==
So I think it's still more reasonable to separate it out with a "if" instead of trying to have common code. It's also easier in the future to make changes to either plugin if there's a clean separation.
If I look at ceph_osd, as another eg,
It's not necessary that microceph would collect a subset of the ceph plugin, the plugins may differ in the files they collect (and they do even now) because the /var/snap/microceph/common/data/osd/* folder in microceph would collect as of now all these files, (minus the keyring which is in the forbidden path)
I think if we have a common path, it would become more complex in the future maintaining this special situation we have, where we are triggering one of two plugins from the one plugin file. So I'd much rather have a clean separation instead of trying to run common code using path variables.
Please let me know whether this makes sense and if the separation is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack, makes sense to me