Reading values missing "E" character for exponentials #50

Open
MrDeoth opened this issue Dec 12, 2022 · 4 comments

MrDeoth commented Dec 12, 2022

Hello,

I am using FISPACT and noticed that the extract_boundaries_and_values function in gammaspectrum.py, which handles output processing, fails on values written in exponential format without the "E" character.
gammaspectrum.txt

I added a "patch" based on my Python experience, which is of course not optimal.

I hope this is the correct way to contribute. If it is not welcome, please let me know.
If it is welcome but clearly not optimal, I would be happy to learn a better method.

Best regards

@thomasms
Member

Hi, thanks for finding this issue.

Looking at the code, that could indeed be the case, although I have not yet seen a test case showing it. Could you perhaps share the file that is causing the issue, so I can add it as a test case? If the spectrum is sensitive, perhaps you can alter the numbers but preserve the failing format.

From your attachment, I assume you're proposing the following fix.

import re

from pypact.output.tags import GAMMA_SPECTRUM_SUB_HEADER
from pypact.util.decorators import freeze_it
from pypact.util.jsonserializable import JSONSerializable

FLOAT_NUMBER = r"[0-9]+(?:\.(?:[0-9]+))?(?:e?(?:[-+]?[0-9]+)?)?"
GAMMA_SPECTRUM_LINE = \
    r"[^(]*\(\s*(?P<lb>{FN})\s*-\s*(?P<ub>{FN})\s*MeV\)\s*(?P<value>{FN})\D*(?P<vr>{FN}).*".format(
        FN=FLOAT_NUMBER,
    )
GAMMA_SPECTRUM_LINE_MATCHER = re.compile(GAMMA_SPECTRUM_LINE, re.IGNORECASE)


@freeze_it
class GammaSpectrum(JSONSerializable):
    """
        The gamma spectrum type from the output
    """

    def __init__(self):
        self.boundaries = []  # TODO dvp: should be numpy arrays (or even better xarrays)
        self.values = []
        self.volumetric_rates = []

    def fispact_deserialize(self, file_record, interval):
        self.__init__()

        lines = file_record[interval]

        def extract_boundaries_and_values(_lines):
            header_found = False
            for line in _lines:
                if not header_found:
                    if GAMMA_SPECTRUM_SUB_HEADER in line:
                        header_found = True
                if header_found:
                    if line.strip() == "":
                        return
                    match = GAMMA_SPECTRUM_LINE_MATCHER.match(line)
                    lower_boundary = float(match.group("lb"))
                    upper_boundary = float(match.group("ub"))
                    value_str = match.group("value")
                    if "E" not in value_str :
                        splitted_value_str = value_str.split("-")
                        splitted_value_str = [splitted_value_str[0], "E-", splitted_value_str[1]]
                        value_str = "".join(splitted_value_str)
                    value = float(value_str)
                    volumetric_rate_str = match.group("vr")
                    if "E" not in volumetric_rate_str :
                        splitted_volumetric_rate_str = volumetric_rate_str.split("-")
                        splitted_volumetric_rate_str = [splitted_volumetric_rate_str[0], "E-", splitted_volumetric_rate_str[1]]
                        volumetric_rate_str = "".join(splitted_volumetric_rate_str)
                    volumetric_rate = float(volumetric_rate_str)
                    yield lower_boundary, upper_boundary, value, volumetric_rate

        boundaries = []
        values = []
        volumetric_rates = []

        for lb, ub, v, vr in extract_boundaries_and_values(lines):
            if not boundaries:
                boundaries.append(lb)
            boundaries.append(ub)
            values.append(v)
            volumetric_rates.append(vr)

        if values:
            self.boundaries = boundaries
            self.values = values
            self.volumetric_rates = volumetric_rates

This could work, but I am now thinking we should probably use the utility function to handle this:
https://github.com/fispact/pypact/blob/master/pypact/util/numerical.py#L12

There are some tests already to try and cover this case - is your failing float an example of one of these tests?
https://github.com/fispact/pypact/blob/master/tests/util/numericaltest.py
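
For reference, here is a minimal sketch of where the string splitting in the quoted patch stops working. The helper names `insert_missing_e` and `try_parse` are hypothetical and exist only for this illustration; the first one reproduces the split-on-"-" logic from the patch above on a single string.

```python
def insert_missing_e(text):
    """Hypothetical helper mirroring the split-on-"-" logic of the quoted patch."""
    if "E" not in text:
        parts = text.split("-")
        text = "".join([parts[0], "E-", parts[1]])
    return float(text)


def try_parse(text):
    """Return the parsed float, or the name of the exception that was raised."""
    try:
        return insert_missing_e(text)
    except (IndexError, ValueError) as error:
        return type(error).__name__


# One well-formed value, then the formats that the split approach cannot handle.
for sample in ["1.23456-12", "1.23456E-12", "1.23456+12", "0.5", "-2.34321-308"]:
    print(f"{sample!r:>16} -> {try_parse(sample)}")
```

The negative-exponent case is repaired, but a positive exponent or a plain value without an exponent has no "-" to split on, and a negative mantissa is split in the wrong place.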

@MrDeoth
Author

MrDeoth commented Dec 13, 2022

Hi,
Thanks for answering.
I'd rather not send you my files, because I don't know to what extent I am allowed to share anything, even with artificial data.
The number format that causes this issue is indeed the Fortran float style, e.g. "-2.34321-308" (which I didn't know existed until now).
Using the utility function is clearly the better option, since my patch would cause issues with negative values in Fortran format. I tested it successfully in my case by replacing the float() calls with get_float() from numerical.py.
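
To illustrate the format, the sketch below shows one way a tolerant parser could handle it. This is not the implementation of get_float() in numerical.py, which may work differently; `parse_fortran_float` and `_MISSING_E` are hypothetical names used only here, and the regular expression is an assumption about which inputs should be accepted.

```python
import re

# Mantissa followed directly by a signed exponent, with no "E" marker,
# e.g. "1.234-05" or "-2.34321-308". (Hypothetical pattern for this sketch.)
_MISSING_E = re.compile(r"^([-+]?(?:\d+\.?\d*|\.\d+))([-+]\d+)$")


def parse_fortran_float(text):
    """Parse a float that may be written Fortran-style without the "E"."""
    text = text.strip()
    match = _MISSING_E.match(text)
    if match:
        # Re-insert the missing marker between mantissa and signed exponent.
        text = f"{match.group(1)}E{match.group(2)}"
    # Fortran also writes double-precision exponents with "D" instead of "E".
    return float(text.replace("D", "E").replace("d", "e"))


for sample in ["-2.34321-308", "1.23456-12", "1.5E-3", "1.0D+02", "0.5"]:
    print(f"{sample!r:>16} -> {parse_fortran_float(sample)!r}")
```

Because the exponent sign is captured by the pattern rather than located with split("-"), a leading minus on the mantissa and a "+" exponent are both handled.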

Thanks

@MrDeoth
Author

MrDeoth commented Dec 13, 2022

PS: That is, replacing the float() calls in gammaspectrum.py.

@MrDeoth MrDeoth closed this as completed Dec 15, 2022
@thomasms thomasms reopened this Dec 22, 2022
@thomasms
Member

Going to reopen this to fix as suggested.
