Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a feature to build a binary debug database at compile time #130

Draft
wants to merge 1 commit into
base: gcos4gnucobol-3.x
Choose a base branch
from

Conversation

ddeclerck
Copy link
Contributor

Disclaimer: this is still a work in progress

This PR adds the ability to collect and dump various information useful for debugging, in particular:

  • a mapping from COBOL locations to C locations
  • information about the different storage of each embedded program (name of COBOL field and attributes, corresponding C variable and offset, etc...)

This information can then used either by a tool that drives gdb or by a gdb extension to improve the debugging experience.
We plan to use it to improve debugging in our VSCode extension.

@codecov-commenter
Copy link

Codecov Report

Attention: 174 lines in your changes are missing coverage. Please review.

Comparison is base (c0d64ad) 65.74% compared to head (b7ff8a3) 65.55%.

Files Patch % Lines
cobc/debuggen.c 0.00% 163 Missing ⚠️
cobc/codegen.c 0.00% 4 Missing and 4 partials ⚠️
cobc/cobc.c 0.00% 2 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@                  Coverage Diff                  @@
##           gcos4gnucobol-3.x     #130      +/-   ##
=====================================================
- Coverage              65.74%   65.55%   -0.20%     
=====================================================
  Files                     32       33       +1     
  Lines                  59092    59267     +175     
  Branches               15575    15599      +24     
=====================================================
+ Hits                   38849    38851       +2     
- Misses                 14262    14430     +168     
- Partials                5981     5986       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@GitMensch
Copy link
Collaborator

FAT WARNING: we do have COBC_GEN_DUMP_COMMENTS which write that data in GC 3.1+ (used by cobcd) and a binary debug tree that can be generated with GC4 (attached to the module in question). There has to be a very compelling reason to add something else, instead I'd highly suggest to inspect both other approaches.

@ddeclerck
Copy link
Contributor Author

a binary debug tree that can be generated with GC4

Are you referring to the cb_tree_print function in cobc/debug.c ?

@GitMensch
Copy link
Collaborator

I'd have to check what this is about (and most important where it comes from), I'm referencing the generated cob_symbol structure

gnucobol/libcob/common.h

Lines 1204 to 1230 in dc0a51e

typedef struct __cob_symbol {
unsigned int parent; /* Index to parent cob_symbol */
unsigned int sister; /* Index to sister cob_symbol */
char *name; /* Field name, NULL means FILLER */
void *adrs; /* Pointer to data pointer */
const cob_field_attr *attr; /* Pointer to attribute */
unsigned int is_file:1; /* 'data' points to FILE pointer */
unsigned int is_indirect:2;/* 'data' points to the field's pointer */
#define SYM_ADRS_DATA 0 /* 'adrs' is direct address of field data */
#define SYM_ADRS_PTR 1 /* 'adrs' is address of address of field data */
#define SYM_ADRS_FIELD 2 /* 'adrs' is address of complete cob_field */
#define SYM_ADRS_VARY 3 /* 'adrs' varys due to prior DEPENDING ON */
unsigned int level:7; /* Level number */
unsigned int section:3; /* SECTION of program */
unsigned int is_group:1; /* Field was Group item */
unsigned int is_redef:1; /* Field has REDEFINES */
unsigned int has_depend:1;/* Field has DEPENDING ON */
unsigned int subscripts:5;/* Field requires N subscripts */
unsigned int unused:11;
unsigned int offset; /* Offset in record, May be ZERO for LINKAGE fields */
unsigned int size; /* Field size */
unsigned int depending; /* Index to DEPENDING ON field */
unsigned int occurs; /* Max number of OCCURS */
unsigned int roffset; /* Original Offset within record */
} cob_symbol;
which also could be extended as needed.

The idea of that was that you don't need any magic C translation at all, because each cob_module with COBOL debug symbols has an entry for each variable and you can directly use those instead. This also works fine in a core-dump which preserves the data (and if the data is not preserved, then you don't need variable symbols either).

And additional to a possible debugger this is also be used for "runtime based" dumping in case of errors, which removes a huge portion of generated code from the C modules.
The only "con" is that you cannot have the debug symbols in an external file so only have it loaded when needed (if one separates this then a magic number needs to be inserted into the module, allowing a debugger to check if the symbol file matches the compile and at least output a warning / do additional checks before accessing the data).

@ddeclerck
Copy link
Contributor Author

That could fit our needs.

While I prefer having the debug info in a separate file (similarly to MSVC's .pdb files), I can accomodate to embedded debug info (after all, that's how gcc behaves by default).

However (correct me if I'm wrong), the current GC4 debug info feature does not produce a map of C/COBOL locations, right ? That could be a valuable addition : I find #line directives slightly unpractical (although it is their purpose...), plus, you don't always have the generated C file at hand.

Also, is there any plan to backport this feature in the 3.x branch ?

@GitMensch
Copy link
Collaborator

the current GC4 debug info feature does not produce a map of C/COBOL locations, right

it doesn't need, because the COBOL line numbers are visible to GCC already (currently in GC4 when compiling with -G (uppercase) you likely have COBOL only in - but this may get removed).
because of the #line directives you don't need the C sources.

@ddeclerck
Copy link
Contributor Author

ddeclerck commented Jan 11, 2024

it doesn't need, because the COBOL line numbers are visible to GCC already (currently in GC4 when compiling with -G (uppercase) you likely have COBOL only in - but this may get removed). because of the #line directives you don't need the C sources.

Now that's awkward. Last time I checked (quite a long time ago, on my previous computer, with an outdated Linux distro), I could't get gdb to place a breakpoint in a COBOL file (eg b file.cob:42) despite the #line directives, and so I assumed that was the normal behavior (not like it's very documented). But I tried again right now and it "just works". Good.

Now, we'd just need that on the 3.x branch.

@GitMensch
Copy link
Collaborator

If you really want that badly, then I'm ok with a merge request.

To do so you'd first check which commits adjusted the code related to symbols (via git blame if the import is good, otherwise via svn blame), then check those revisions for unrelated changes, if there aren't any you can locally test that with an svn merge, this may result inn 98% working code.
Otherwise you'd need to note down unrelated changes, then do an svn diff and revert those unrelated parts, still we'd like to know the svn revision numbers.

Important: the removal of COB_GEN_DUMP_COMMENTS code is not possible to not break cobcd; only the "real dump" code would not be used (= the old code only generated as a comment, the new code additional to generate the new structure); the old dump code would have to stay in libcob (for modules generated with GC 3+) and then new code for dumping be included additional.

Depending on how familiar you are with the process I'd guess this to be a work of 1-4 working days; but if done that way, it would be fine to be integrated into GC 3.x.

@ddeclerck ddeclerck added the duplicate This issue or pull request already exists label Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This issue or pull request already exists
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants