Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored and Optimized Logic:: Parser Logic, Latex Based Output & Shared Modules #283

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

IIITM-Jay
Copy link
Member

@IIITM-Jay IIITM-Jay commented Sep 16, 2024

This PR intends to refactor the parse.py to achieve maintainability and scalability.

Scopes Covered

  1. Modularization: Breaking the codes in small and maintainable helper functions containing understandable lines of code
  2. Optimization: Reducing number of lines of codes using techniques such as list comprehensions etc.
  3. Removing Redundancy: Removing duplicacy and utilizing reusablity of logics
  4. Refactorization: Refactor a long script into dedicated scripts for scalability

Approach Followed

  1. The parse.py file now contains only the main function where it calls respective scripts for generating the output based on extensions selected.
  2. Common methods/ functions are moved to shared_utils.py so that other scripts can re-use them efficiently. These functions are shared between various extension scripts
  3. The output for Latex is refactored and optimized in larex_utils.py
  4. Scripts: parse.py and shared_utils.py are refactored, modularized and optimized as well
  5. Separate dedicated scripts being created for each extension

Needs to be done

  • The codes for other extension outputs are only moved to their respective scripts(except for latex). They require improvements and enhancements. They are just simply taken from parse.py to their modules like c_utils.py for generating c based output
  • In the shared_utils.py, the method create_inst_dict() is yet to be refactored.

@IIITM-Jay
Copy link
Member Author

IIITM-Jay commented Sep 16, 2024

Hi @aswaterman and @rpsene, I have made an attempt to refactor the parse.py. For now, only latex_utils.py, shared_utils.py and parse.py and refactored, optimized and modularized.

The test cases are failing as we need to modify the tests.py as well based on the accepted refactored code.
P.S. Note: For the time being, I have imported the shared module methods for running the test cases, now all checks passed

Requesting feedback on the modified code and suggestions on what best we can do to achieve maintainability and scalability.

@IIITM-Jay
Copy link
Member Author

Hi @aswaterman, in addition to the written explanation of the parsing logic inside "Flow of parse.py" section of README file, I believe including flowcharts would greatly enhance the clarity and readability of the process. By visualizing the flow, readers can get a clearer understanding of the key steps involved in parsing instruction encodings.

Like, we have three main steps with each having sequence of procedures:
1. The first pass, we cover only the regular expression and follow the below steps:

flowchart TD
    A[Start: parse.py] --> B[Create list of all rv* files]
    B --> C{File contains regular instructions?}
    C -->|Yes| D[Parse file line by line]
    D --> E[Perform checks on regular instructions]
    
    E --> F[Check 1: msb > lsb in range assignment]
    E --> G[Check 2: Value representable in range]
    E --> H[Check 3: No multiple assignments to same bit]
    E --> I[Check 4: All bit positions must be accounted for]
    
    F --> J[Pass checks?]
    G --> J
    H --> J
    I --> J
    
    J -->|Yes| K[Create dictionary for regular instruction]
    K --> L[Add encoding, extension, mask, match, variable_fields]
    L --> M[Add to instr_dict]
    
    M --> N[Process next regular instruction]
    N -->|All regular instructions processed| O[End of Regular Instruction Parsing]
Loading

@IIITM-Jay
Copy link
Member Author

2. In the second pass, we do the checks for pseudo_instr carrying out similar procedure.

3. In the last step, the output generation,

flowchart TD
    A[Start Output Generation] --> B[Generate LaTeX tables]
    B --> C[Generate encoding.h file]
    C --> D[Generate other artifacts]

    D --> E[Output files generated]
    E --> F[End]

Loading

@IIITM-Jay
Copy link
Member Author

IIITM-Jay commented Sep 21, 2024

@aswaterman and @rpsene , These flowchart creation process are markdown friendly. I think it will also help to give a nutshell view of what we are doing. Let me know the feedback and suggestions, if these will add up to an enhancement for the existing repository.

@IIITM-Jay
Copy link
Member Author

IIITM-Jay commented Sep 25, 2024

@aswaterman and @rpsene , Refactored and Optimized the method used for creating instruction dictionary.

P.S. The second point from Needs to be Done header in PR description is ticked(marked as completed) as of now

@IIITM-Jay
Copy link
Member Author

IIITM-Jay commented Oct 2, 2024

The conflicts arises as after the commit w.r.t walrus operator made in parse.py.

@aswaterman, the walrus operator (:=) is compatible with Python 3.8 and later versions, as it was introduced in Python 3.8, and it allows assignment within an expression, making it useful for situations where we want to assign and evaluate a variable in the same line.
As far as I know, if I may not be wrong, python 3.6 is already having security vulnerabilities and reached its end in 2021.

So, wanted to know, whether we need to keep in mind with older versions compatibility of Python while doing refactorization and optimization.
Thanks!

@aswaterman
Copy link
Member

I'll try to provide feedback on this PR soon. To answer the immediate question: if we have good reason to start making extensive use of newer Python features, I have no objections to upgrading, but I did not think that a single use of the walrus operator was a sufficient justification for requiring a newer Python version.

@IIITM-Jay
Copy link
Member Author

but I did not think that a single use of the walrus operator was a sufficient justification for requiring a newer Python version.

Yes, I do agree with this. While going through the files in this repository, I found that the walrus operator is the only python 3.8 version feature that is being used. Although there are lot of other features in later versions such as Positional-Only Parameters, TypeDict etc., but those need not be used extensively here as they are not required and they do not add any additional benefits

@aswaterman
Copy link
Member

You've convinced me that other new features are a good enough justification to move to 3.8.

Copy link
Member

@aswaterman aswaterman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very reasonable refactoring. I presume at this point there will be some merge conflicts with respect to other recent changes to the master branch, but feel free to merge once you've resolved them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants