Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support collating sequence for indexed file keys of alphanumeric class #177

13 changes: 12 additions & 1 deletion cobc/ChangeLog
Original file line number Diff line number Diff line change
@@ -1,12 +1,23 @@

2024-09-12 Nicolas Berthier <[email protected]>

* tree.c (validate_indexed_key_field): warn about ignored collating
sequence for non-alphanumeric keys (considers only primary keys and file
collating sequence for now)
* codegen.c (output_indexed_file_key_colseq): assign collating sequence
for any key of alphanumeric class, and preliminary handing of NATIONAL
collations
* parser.y: adjust position of messages about unfinished KEY or FILE
COLLATING SEQUENCE

2024-09-29 Simon Sobisch <[email protected]>

* cobc.c (cobc_print_info): drop COB_LI_IS_LL
in favor of existing COB_32_BIT_LONG

2024-09-27 Simon Sobisch <[email protected]>

* plex.l, scanner.l: use noyywrap option instead of manually
* plex.l, scanner.l: use noyywrap option instead of manually
defining related code parts
* typeck.c (cb_tree_list_has_numeric_ref_or_field): cleanup

Expand Down
12 changes: 5 additions & 7 deletions cobc/codegen.c
Original file line number Diff line number Diff line change
Expand Up @@ -9341,16 +9341,14 @@ output_indexed_file_key_colseq (const struct cb_file *f, const struct cb_alt_key
{
const cb_tree key = ak ? ak->key : f->key;
const cb_tree key_col = ak ? ak->collating_sequence_key : f->collating_sequence_key;
const int type = cb_tree_type (key, cb_code_field (key));
cb_tree col = NULL;

/* We only apply a collating sequence if the key is alphanumeric / display */
if ((type & COB_TYPE_ALNUM) || (type == COB_TYPE_NUMERIC_DISPLAY)) {
/* We only apply a collating sequence if the key is of class alphanumeric;
Warned in `validate_indexed_key_field`. */
if (CB_TREE_CLASS (key) == CB_CLASS_ALPHANUMERIC) {
col = key_col ? key_col : f->collating_sequence;
#if 0 /* TODO: this should be done for national, when available */
} else if (type & COB_TYPE_NATIONAL) {
col = key_col_n ? key_col_n : f->collating_sequence_n;
#endif
} else if (CB_TREE_CLASS (key) == CB_CLASS_NATIONAL) {
col = f->collating_sequence_n;
}

output_prefix ();
Expand Down
4 changes: 2 additions & 2 deletions cobc/parser.y
Original file line number Diff line number Diff line change
Expand Up @@ -5784,7 +5784,7 @@ collating_sequence_clause:
check_repeated ("COLLATING", SYN_CLAUSE_3, &check_duplicate);
current_file->collating_sequence = alphanumeric_collation;
current_file->collating_sequence_n = national_collation;
CB_UNFINISHED ("FILE COLLATING SEQUENCE"); /* only implemented for BDB */
CB_UNFINISHED_X (alphanumeric_collation, "FILE COLLATING SEQUENCE"); /* only implemented for BDB */
}
;

Expand Down Expand Up @@ -5836,7 +5836,7 @@ collating_sequence_clause_key:
and also attached to the correct key later, so just store in a list here: */
current_file->collating_sequence_keys =
cb_list_add(current_file->collating_sequence_keys, CB_BUILD_PAIR ($6, $4));
CB_UNFINISHED ("KEY COLLATING SEQUENCE"); /* only implemented for BDB */
CB_UNFINISHED_X ($6, "KEY COLLATING SEQUENCE"); /* only implemented for BDB */
}
;

Expand Down
21 changes: 21 additions & 0 deletions cobc/tree.c
Original file line number Diff line number Diff line change
Expand Up @@ -4781,6 +4781,27 @@ validate_indexed_key_field (struct cb_file *f, struct cb_field *records,
}
}
}

/* check collating sequence is not ignored */
if (get_warn_opt_value (cb_warn_filler) != COBC_WARN_DISABLED
&& CB_TREE_CLASS (k) != CB_CLASS_ALPHANUMERIC) {
const char *source = "KEY";
cb_tree colseq = (cbak == NULL)
? f->collating_sequence_key
: cbak->collating_sequence_key;
cb_tree pos = colseq;
if (colseq == NULL) {
source = "FILE";
colseq = f->collating_sequence;
pos = key_ref;
}
if (colseq != NULL) {
cb_warning_x (COBC_WARN_FILLER, CB_TREE (pos),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preferable would be a new warning group -Wignored-something enabled by default (and add the handful of "ignored" warnings to the same group, but as I currently don't have a clue for a good name and it isn't important this can be done in a follow-up commit :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May ignored-collations be too precise for this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would work but I'd prefer a warning option that we can use for any ignored elements (so yes, too specific ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So that'd be a category like ignored-file-control-clause or something even more general. I suggest to leave this for a subsequent PR (I won't be available in the very near future to work on that — currently in (pre-)holiday transit, somewhere in Europe ;).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even more - totally global - so we can include every "ignored" that has no separate warning level (that should be 1 or two warnings currently).

And yes, that's for "sometime in the future".

_("%s COLLATING SEQUENCE '%s' is ignored "
"for non-alphanumeric key '%s'"),
source, CB_NAME (colseq), k->name);
}
}
}

void
Expand Down
159 changes: 152 additions & 7 deletions tests/testsuite.src/run_file.at
Original file line number Diff line number Diff line change
Expand Up @@ -12444,7 +12444,7 @@ AT_CLEANUP


# This is, so far, only supported by the BDB backend
AT_SETUP([INDEXED files under ASCII/EBCDIC collation])
AT_SETUP([INDEXED file under ASCII/EBCDIC collation])
AT_KEYWORDS([runfile WRITE DELETE READ EBCDIC])

AT_SKIP_IF([test "$COB_HAS_ISAM" != "db"])
Expand Down Expand Up @@ -12685,13 +12685,15 @@ CCC 888 +0000000043 1
DONE
])

# Note: `-Wno-others` is for ignored COLLATIONS for non-alphanumeric keys

# Testing ASCII file collating sequence using clause
AT_DATA([prog1.cob], [
COPY "prog.cpy" REPLACING
==FILE-COLSEQ== BY ==COLLATING SEQUENCE IS ASCII==
==KEY-COLSEQ== BY ====.
])
AT_CHECK([$COMPILE -Wno-unfinished prog1.cob], [0], [], [])
AT_CHECK([$COMPILE -Wno-unfinished -Wno-others prog1.cob], [0], [], [])
AT_CHECK([$COBCRUN_DIRECT ./prog1 1>prog1.out], [0], [], [])
AT_CHECK([diff reference_ascii prog1.out], [0], [], [])

Expand All @@ -12712,7 +12714,7 @@ AT_DATA([prog3.cob], [
==FILE-COLSEQ== BY ==COLLATING SEQUENCE IS ASCII==
==KEY-COLSEQ== BY ==COLLATING SEQUENCE OF MY-AKEY1 IS EBCDIC==.
])
AT_CHECK([$COMPILE -Wno-unfinished prog3.cob], [0], [], [])
AT_CHECK([$COMPILE -Wno-unfinished -Wno-others prog3.cob], [0], [], [])
AT_CHECK([$COBCRUN_DIRECT ./prog3 1>prog3.out], [0], [], [])
AT_CHECK([diff reference_ascii_ebcdic prog3.out], [0], [], [])

Expand All @@ -12732,7 +12734,7 @@ AT_DATA([prog5.cob], [
==FILE-COLSEQ== BY ==COLLATING SEQUENCE IS EBCDIC==
==KEY-COLSEQ== BY ==COLLATING SEQUENCE OF MY-AKEY1 IS EBCDIC==.
])
AT_CHECK([$COMPILE -Wno-unfinished prog5.cob], [0], [], [])
AT_CHECK([$COMPILE -Wno-unfinished -Wno-others prog5.cob], [0], [], [])
AT_CHECK([$COBCRUN_DIRECT ./prog5 1>prog5.out], [0], [], [])
AT_CHECK([diff reference_ebcdic prog5.out], [0], [], [])

Expand All @@ -12742,7 +12744,7 @@ AT_DATA([prog6.cob], [
==FILE-COLSEQ== BY ====
==KEY-COLSEQ== BY ====.
])
AT_CHECK([$COMPILE -Wno-unfinished -fdefault-file-colseq=EBCDIC prog6.cob], [0], [], [])
AT_CHECK([$COMPILE -Wno-unfinished -Wno-others -fdefault-file-colseq=EBCDIC prog6.cob], [0], [], [])
AT_CHECK([$COBCRUN_DIRECT ./prog6 1>prog6.out], [0], [], [])
AT_CHECK([diff reference_ebcdic prog6.out], [0], [], [])

Expand All @@ -12752,7 +12754,7 @@ AT_DATA([prog7.cob], [
==FILE-COLSEQ== BY ==COLLATING SEQUENCE IS EBCDIC==
==KEY-COLSEQ== BY ==COLLATING SEQUENCE OF MY-AKEY1 IS ASCII==.
])
AT_CHECK([$COMPILE -Wno-unfinished prog7.cob], [0], [], [])
AT_CHECK([$COMPILE -Wno-unfinished -Wno-others prog7.cob], [0], [], [])
AT_CHECK([$COBCRUN_DIRECT ./prog7 1>prog7.out], [0], [], [])
AT_CHECK([diff reference_ebcdic_ascii prog7.out], [0], [], [])

Expand All @@ -12762,13 +12764,156 @@ AT_DATA([prog8.cob], [
==FILE-COLSEQ== BY ====
==KEY-COLSEQ== BY ==COLLATING SEQUENCE OF MY-AKEY1 IS ASCII==.
])
AT_CHECK([$COMPILE -Wno-unfinished -fdefault-file-colseq=EBCDIC prog8.cob], [0], [], [])
AT_CHECK([$COMPILE -Wno-unfinished -Wno-others -fdefault-file-colseq=EBCDIC prog8.cob], [0], [], [])
AT_CHECK([$COBCRUN_DIRECT ./prog8 1>prog8.out], [0], [], [])
AT_CHECK([diff reference_ebcdic_ascii prog8.out], [0], [], [])

AT_CLEANUP


AT_SETUP([INDEXED file with collation on group key])
AT_KEYWORDS([runfile WRITE READ EBCDIC])

AT_SKIP_IF([test "$COB_HAS_ISAM" = "no"])

# This is, so far, only supported by the BDB backend
AT_XFAIL_IF([test "$COB_HAS_ISAM" != "db"])

AT_DATA([prog.cob], [
IDENTIFICATION DIVISION.
PROGRAM-ID. prog.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT MY-FILE ASSIGN TO "testfile"
ORGANIZATION IS INDEXED
ACCESS IS DYNAMIC
RECORD KEY IS MY-KEY.
DATA DIVISION.
FILE SECTION.
FD MY-FILE.
01 MY-REC.
05 MY-KEY.
10 MY-KEY-1 PIC X.
10 MY-KEY-2 PIC X.
05 MY-DATA PIC 9.
PROCEDURE DIVISION.

OPEN OUTPUT MY-FILE
MOVE "111" TO MY-REC WRITE MY-REC
MOVE "AA2" TO MY-REC WRITE MY-REC
MOVE "223" TO MY-REC WRITE MY-REC
MOVE "BB4" TO MY-REC WRITE MY-REC
MOVE "335" TO MY-REC WRITE MY-REC
MOVE "CC6" TO MY-REC WRITE MY-REC
MOVE "447" TO MY-REC WRITE MY-REC
MOVE "DD8" TO MY-REC WRITE MY-REC
CLOSE MY-FILE

OPEN INPUT MY-FILE
MOVE LOW-VALUES TO MY-KEY
START MY-FILE KEY >= MY-KEY
INVALID KEY
DISPLAY "INVALID KEY"
NOT INVALID KEY
PERFORM UNTIL EXIT
READ MY-FILE NEXT
AT END
EXIT PERFORM
NOT AT END
DISPLAY MY-REC
END-READ
END-PERFORM
END-START.
CLOSE MY-FILE

STOP RUN.
])

AT_DATA([expout],
[ASCII:
111
223
335
447
AA2
BB4
CC6
DD8
EBCDIC:
AA2
BB4
CC6
DD8
111
223
335
447
])

# Note: ignore any unfinished warning as the test is about the runtime behavior:
AT_CHECK([$COMPILE -Wno-unfinished -fdefault-file-colseq=ASCII prog.cob -o ascii], [0])
AT_CHECK([$COMPILE -Wno-unfinished -fdefault-file-colseq=EBCDIC prog.cob -o ebcdic], [0])

AT_CHECK([
echo "ASCII:" && $COBCRUN_DIRECT ./ascii && \
echo "EBCDIC:" && $COBCRUN_DIRECT ./ebcdic
], [0], [expout]) # <- compare stdout with existing `expout`

AT_CLEANUP


# Note: codegen only for now
AT_SETUP([INDEXED file with NATIONAL collation])
AT_KEYWORDS([runfile])

AT_SKIP_IF([test "$COB_HAS_ISAM" = "no"])

AT_DATA([prog.cob], [
IDENTIFICATION DIVISION.
PROGRAM-ID. prog.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT MY-FILE ASSIGN TO "testfile"
ORGANIZATION IS INDEXED
ACCESS IS DYNAMIC
RECORD KEY IS MY-KEY
COLLATING SEQUENCE FOR NATIONAL IS ASCII.
DATA DIVISION.
FILE SECTION.
FD MY-FILE.
01 MY-REC.
05 MY-KEY PIC N.
05 MY-DATA PIC 9.
PROCEDURE DIVISION.

OPEN OUTPUT MY-FILE
MOVE "11" TO MY-REC WRITE MY-REC
MOVE "A2" TO MY-REC WRITE MY-REC
MOVE "23" TO MY-REC WRITE MY-REC
MOVE "B4" TO MY-REC WRITE MY-REC
MOVE "35" TO MY-REC WRITE MY-REC
MOVE "C6" TO MY-REC WRITE MY-REC
MOVE "47" TO MY-REC WRITE MY-REC
MOVE "D8" TO MY-REC WRITE MY-REC
CLOSE MY-FILE

STOP RUN.
])

AT_DATA([expout], [])

# Note: ignore any unfinished warning as the test is about the runtime behavior:
AT_CHECK([$COMPILE -Wno-unfinished prog.cob], [0], [],
[prog.cob:11: warning: NATIONAL COLLATING SEQUENCE is not implemented
])

AT_CHECK([$COBCRUN_DIRECT ./prog], [0], [expout])

AT_CLEANUP


AT_SETUP([INDEXED file numeric keys ordering])
AT_KEYWORDS([runfile])

Expand Down
Loading
Loading