Disassembling failing in xasm format #93

Open
Vaipex opened this issue May 31, 2022 · 6 comments
Comments

@Vaipex

Vaipex commented May 31, 2022

Hi,

I'm currently trying to extract the bytecode, edit a few strings, and assemble it back into a .pyc file.
pydisasm without any flags works just fine, but as soon as I try to disassemble the file with pydisasm -F xasm ./file.pyc it fails with the following traceback:

Traceback (most recent call last):
  File "/usr/local/bin/pydisasm", line 33, in <module>
    sys.exit(load_entry_point('xdis', 'console_scripts', 'pydisasm')())
  File "/usr/lib/python3/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/root/python-xdis/xdis/bin/pydisasm.py", line 72, in main
    disassemble_file(path, sys.stdout, format)
  File "/root/python-xdis/xdis/disasm.py", line 329, in disassemble_file
    disco(
  File "/root/python-xdis/xdis/disasm.py", line 160, in disco
    disco_loop_asm_format(opc, version_tuple, co, real_out, {}, set([]))
  File "/root/python-xdis/xdis/disasm.py", line 220, in disco_loop_asm_format
    disco_loop_asm_format(
  File "/root/python-xdis/xdis/disasm.py", line 220, in disco_loop_asm_format
    disco_loop_asm_format(
  File "/root/python-xdis/xdis/disasm.py", line 249, in disco_loop_asm_format
    assert mapped_name not in fn_name_map
AssertionError

I also printed out the vars from the assert:

mapped_name='listcomp_0x7f3d301932f0'

fn_name_map={'listcomp_0x7f3d30192ff0': 'listcomp', 'listcomp_0x7f3d301932f0': 'listcomp'}

@rocky
Owner

rocky commented Jun 1, 2022

In order for me to work on this, I'd need a complete short example with the pyc you started out with, the disassembly of that, the change to the assembly, and finally the resulting pyc. The shortest example that shows the problem is best.

@Vaipex
Author

Vaipex commented Jun 1, 2022

I never got to the point of successfully disassembling the .pyc, so there is no newly assembled file, but here is one of the failing files.

test.zip

@rocky
Owner

rocky commented Jun 1, 2022

Ah - I see what's up. If I or someone else doesn't answer this in a week or so, remind me.

@Vaipex
Author

Vaipex commented Jun 1, 2022

alright, thank you!

@Vaipex
Author

Vaipex commented Jun 7, 2022

any updates? @rocky

@rocky
Owner

rocky commented Jun 8, 2022

Here is my understanding of the situation.

Some background first.

For each list comprehension that appears in Python code, a code object is created for the "body" of the comprehension. For example, if you write:

[x + 1 for x in collection]

Parts of the disassembly will look like:

# Source code size mod 2**32: 26 bytes
# Method Name:       <module>
...
# Stack size:        2
# Flags:             0x00000040 (NOFREE)
# First Line:        1
# Constants:
#    0: <code object <listcomp> at 0x7fe0b8beb9f0, file "lc.py", line 1>
#    1: '<listcomp>'
#    2: None
# Names:
#    0: __file__
  1:           0 LOAD_CONST           (<code object <listcomp> at 0x7fe0b8beb9f0, file "lc.py", line 1>)
...
# Method Name:       <listcomp>
...
  1:           0 BUILD_LIST           0
               2 LOAD_FAST            (.0)

The function or method named <listcomp> is created for the x + 1 part of the source code.

If there is another list comprehension, another code object with the same method name, <listcomp>, is created.
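The repeated names can be seen with a small sketch that compiles some source and lists the names of the nested code objects. The helper name `nested_code_names` and the sample source are made up for illustration and are not part of xdis. Note that CPython 3.12 and later inline list comprehensions (PEP 709), so no separate <listcomp> code object shows up there; the lambda example behaves the same way across Python 3 versions.

```python
import types


def nested_code_names(source, filename="lc.py"):
    """Return co_name for every code object nested inside the compiled module.

    Hypothetical helper for illustration only; it is not an xdis function.
    """
    top = compile(source, filename, "exec")  # compiles only, runs nothing
    names = []
    pending = [top]
    while pending:
        co = pending.pop()
        for const in co.co_consts:
            # Nested functions, lambdas and (before 3.12) comprehensions
            # are stored as code objects in the parent's constants.
            if isinstance(const, types.CodeType):
                names.append(const.co_name)
                pending.append(const)
    return names


two_listcomps = "a = [x + 1 for x in collection]\nb = [y * 2 for y in collection]\n"
# On CPython 3.11 and earlier this should list '<listcomp>' twice;
# on 3.12+ the comprehensions are inlined and the list is empty.
print(nested_code_names(two_listcomps))

# Lambdas show the same kind of name clash on any Python 3 version.
print(nested_code_names("f = lambda: 1\ng = lambda: 2\n"))
```

Because the names alone are ambiguous, anything that writes the code objects out by name needs some extra piece of information to tell them apart.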

The way the disassembler disambiguates the different <listcomp> methods is to append the hex address, e.g. 0x7fe0b8beb9f0, to the end of the name.

Apparently there are two listcomp methods with the same name, including the hex address.

I don't understand how that is possible, but apparently it is.
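As a rough sketch of the naming scheme described above (not xdis's actual implementation), the following appends an object's address to a base name and records the result in a map, failing when a mapped name shows up twice. The helpers `uniquify`, `build_name_map` and `nested_code_objects` are hypothetical, and using `id()` as the address is an assumption. In this sketch a collision happens when the same object is visited twice; whether that is what happens in the reported case is not established here.

```python
import types


def nested_code_objects(source, filename="example.py"):
    """Collect the code objects stored directly in a compiled module's constants."""
    top = compile(source, filename, "exec")
    return [c for c in top.co_consts if isinstance(c, types.CodeType)]


def uniquify(basename, obj):
    """Hypothetical helper: append the object's address in hex to basename."""
    return "%s_0x%x" % (basename, id(obj))


def build_name_map(code_objects):
    """Map each disambiguated name back to its base name.

    Raises ValueError if two entries end up with the same mapped name,
    which is the situation the failing assert complains about.
    """
    fn_name_map = {}
    for co in code_objects:
        basename = co.co_name.strip("<>")  # "<lambda>" -> "lambda"
        mapped_name = uniquify(basename, co)
        if mapped_name in fn_name_map:
            raise ValueError("duplicate mapped name: %s" % mapped_name)
        fn_name_map[mapped_name] = basename
    return fn_name_map


code_objects = nested_code_objects("f = lambda: 1\ng = lambda: 2\n")
print(build_name_map(code_objects))  # two distinct keys, both mapping to 'lambda'

try:
    # Visiting the same object twice yields the same address, so the names collide.
    build_name_map([code_objects[0], code_objects[0]])
except ValueError as exc:
    print(exc)
```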

I believe a simple workaround is to run the disassembler with a Python interpreter whose version matches the version of the bytecode in the .pyc file.

When that is done, the "native" structure of the code object is used instead of xdis' own structure for a code object, and I think no name mapping is needed.

I could be wrong here though.
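To check whether a given interpreter matches a .pyc file, one option is to compare the file's first four bytes (the magic number) with the running interpreter's `importlib.util.MAGIC_NUMBER`. This is a minimal sketch using only the standard library; the function names are hypothetical, and it says nothing about which code path xdis will actually take.

```python
import importlib.util


def magic_matches(header):
    """Return True if the pyc header bytes start with this interpreter's magic number."""
    return header[:4] == importlib.util.MAGIC_NUMBER


def pyc_matches_interpreter(pyc_path):
    """Read the first four bytes of a .pyc file and compare the magic number."""
    with open(pyc_path, "rb") as f:
        return magic_matches(f.read(4))
```

For example, `pyc_matches_interpreter("./file.pyc")` returning `False` would suggest trying the disassembler under a different Python version.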
