Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative implementation of the policy document parser #3246

Merged
merged 7 commits into from
Oct 25, 2024

Conversation

narrieta
Copy link
Member

Discussed with @mgunnala an alternative approach to write the parse for policy documents. It is similar in concept to a recursive descendent parser and produces code that is simpler to write, read, and debug.

We also decided to make the document version a required attribute.

Lastly, I added code to also parse the runtimePolicy for extensions.

_PolicyEngine.__parse_version(template, custom_policy)
_PolicyEngine.__parse_extension_policies(template, custom_policy)
if not isinstance(policy, dict):
raise InvalidPolicyError("expected an object describing a Policy; got {0}.".format(type(policy).__name__))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: policy doesn't need to be capitalized

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks; in that message, I'm trying to refer to the concept/type of "Policy", rather than a specific "policy" instance.

if that does not make sense, i can lowercase it

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if that makes sense to me, but if it does to you, you can leave it uppercase

if not isinstance(policy, dict):
raise InvalidPolicyError("expected an object describing a Policy; got {0}.".format(type(policy).__name__))

_PolicyEngine._check_attributes(policy, object_name="policy", valid_attributes=["policyVersion", "extensionPolicies"])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we get valid_attributes from a schema constant instead of hardcoding them here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i want them to be inline so that people can scan/read the code linearly, without having to jump to the definition of a constant to check what the valid attributes are

for extension_name, individual_policy in extensions.items():
for extension, extension_policy in extensions.items():
if not isinstance(extension_policy, dict):
raise InvalidPolicyError("invalid type {0} for attribute '{1}', must be 'object'".format(type(extension_policy).__name__, extension))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: semicolon before "must" for consistency with the error message in get_value( )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks; fixed

raise InvalidPolicyError(
"invalid value for attribute 'policyVersion' attribute 'policyVersion' is expected to be in format 'major.minor.patch' "
"(e.g., '1.0.0'). Please change to a valid value.")
if not re.match(r"^\d+\.\d+\.\d+$", version):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also accept "1" or "1.0" as valid versions (I should have added a unit test for too).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the strict check since that is what the error message was specifically asking ('major.minor.patch'). Note, though, that this check should be done with a regex or similar, since FlexibleVersion accepts many, many other formats than 'major.minor.patch'.

"1", and "1.0" sound ok to me. If you want, you could add that after this change gets merged in

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add it in the next PR!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgunnala - I needed to add some unit tests to address other comments, so I went ahead and made this change and added the corresponding unit tests,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!!

"invalid value for attribute 'policyVersion' attribute 'policyVersion' is expected to be in format 'major.minor.patch' "
"(e.g., '1.0.0'). Please change to a valid value.")
if not re.match(r"^\d+\.\d+\.\d+$", version):
raise InvalidPolicyError("invalid value for attribute 'policyVersion'; it should be in format 'major.minor.patch' (e.g., '1.0.0')")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe include the provided version in the error message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks; added


# The runtimePolicy is an arbitrary object.
runtime_policy = extension.get("runtimePolicy")
if runtime_policy is not None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to the schema defined spec, runtimePolicy is an object with arbitrary attributes, so we should probably validate that the type is dict.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or if we want to accept list, string, etc., update the schema and spec accordingly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the extension policy is completely arbitrary and defined by the extension. spec needs to be updated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note about this to the spec

}
for k in object_.keys():
if k not in valid_attributes:
raise InvalidPolicyError("invalid attribute '{0}' in {1}".format(k, object_name))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "unrecognized" or "undefined" instead of "invalid" to make it clear that the attribute is not defined in the schema, as opposed to the type or format being invalid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalid: (of computer instructions, data, etc.) not conforming to the correct format or specifications

your comment made me doubt :)

but i can change it to "unknown" if it helps

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your usage isn't wrong but "invalid" sounds to me like a format/type issue. I think "unknown" is clearer, but I'm also fine with leaving it as-is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

went for "unrecognized", after all

thanks for the suggestion

@staticmethod
def _get_dictionary(object_, attribute, name_prefix="", optional=False, default=None):
"""
Returns object[attribute] if it exists, verifying that it is a dictionary, else returns default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we update the comment to indicate this raises InvalidPolicyError if type verification fails

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, updated

azurelinuxagent/ga/policy/policy_engine.py Show resolved Hide resolved
azurelinuxagent/ga/policy/policy_engine.py Show resolved Hide resolved
azurelinuxagent/ga/policy/policy_engine.py Show resolved Hide resolved
"allowListedExtensionsOnly": _PolicyEngine._get_boolean(extension_policies, attribute="allowListedExtensionsOnly", name_prefix="extensionPolicies.", optional=True, default=_DEFAULT_ALLOW_LISTED_EXTENSIONS_ONLY),
"signatureRequired": _PolicyEngine._get_boolean(extension_policies, attribute="signatureRequired", name_prefix="extensionPolicies.", optional=True, default=_DEFAULT_SIGNATURE_REQUIRED),
"extensions": _PolicyEngine._parse_extensions(
_PolicyEngine._get_dictionary(extension_policies, attribute="extensions", name_prefix="extensionPolicies.", optional=True, default=_CaseFoldedDict.from_dict(_DEFAULT_EXTENSIONS))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_PolicyEngine._get_dictionary(extension_policies, attribute="extensions", name_prefix="extensionPolicies.", optional=True, default=_CaseFoldedDict.from_dict(_DEFAULT_EXTENSIONS))
_PolicyEngine._get_dictionary(extension_policies, attribute="extensions", name_prefix="extensionPolicies.", optional=True, default=__DEFAULT_EXTENSIONS)

nit: we only return CaseFoldedDict in _parse_extensions, so not necessary to create case folded dict here too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point; removed


for extension, extension_policy in extensions.items():
if not isinstance(extension_policy, dict):
raise InvalidPolicyError("invalid type {0} for attribute '{1}'; must be 'object'".format(type(extension_policy).__name__, extension))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we specify full path of the attribute in the policy doc here?
extensionPolicies.extensions.{extension}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup; added

@@ -311,7 +305,7 @@ def should_enforce_signature_validation(self, extension_to_check):

global_signature_required = self._policy.get("extensionPolicies").get("signatureRequired")
individual_policy = self._policy.get("extensionPolicies").get("extensions").get(extension_to_check.name)
if individual_policy is None:
if individual_policy is None or len(individual_policy) == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if individual policy has 'runtimePolicy' attribute, but not 'signatureRequired' attribute?
Then we enter else condition and return None.

I think instead we should check

if individual_policy is None or individual_policy.get('signatureRequired') is None

and add unit test

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoa, good catch!

I could try to explain why I missed this, but that would sound just like an excuse for my laziness. Updated code and added unit test.

@mgunnala - I refactored some of the unit tests as part of this

@narrieta
Copy link
Member Author

@mgunnala - FYI: as part of my latest changes, I changed ExtensionPolicyEngine.should_allow_extension() and ExtensionPolicyEngine.should_enforce_signature_validation() to take an extension name, rather than an Extension object. This is in order to avoid the implicit dependency on azurelinuxagent.common.protocol.restapi; there is no need to depend on that module.

Copy link
Contributor

@maddieford maddieford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just left 2 comments (on comments)

@@ -273,7 +277,7 @@ def _get_value(object_, attribute, name_prefix, type_, type_name, optional, defa

class ExtensionPolicyEngine(_PolicyEngine):

def should_allow_extension(self, extension_to_check):
def should_allow_extension(self, extension_name):
"""
Return whether we should allow extension download based on policy.
extension_to_check is expected to be an Extension object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is no longer applicable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks; updated


@staticmethod
def __parse_version(template, policy):
def _parse_policy_version(policy):
"""
Validate and return "policyVersion" attribute. If not a string in the format "x.y.z", raise InvalidPolicyError.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'If not a string in the format "x.y.z"'

nit: 'If not a string in the format "major[.minor[.patch]]"'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks; updated

@narrieta narrieta merged commit 9c565c7 into Azure:develop Oct 25, 2024
11 checks passed
@narrieta narrieta deleted the parser branch October 25, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants