Implementation multi pattern regular expressions #2161

Open
wants to merge 26 commits into
base: master
from

Conversation

biathlon3
Contributor

No description provided.

kingluo added 20 commits May 25, 2024 17:29
Problems:

1. In the new kernel, assembly functions uniformly return through
   `__x86_return_thunk`. However, our assembly code uses the plain
   `ret` instruction, so objtool flags it as a naked return when the
   kernel is built.

2. `SYM_FUNC_START` in the new kernel adds `endbr64` to the head of
   the assembly function, and with IBT enabled every indirect jump must
   land on an `ENDBR` instruction. Our assembly functions use jump
   tables to jump indirectly to code snippets within the same function,
   and those targets have no `ENDBR`, so the jumps raise a CET exception
   (see https://en.wikipedia.org/wiki/X86_instruction_listings#Added_with_Intel_CET).

Solutions:

1. Replace `ret` with `RET`, a macro in the new kernel that ensures
   the correct return.

2. Use `notrack jmp` for the jump-table jumps and enable no-track in
   the CPU settings:
   `wrmsrl(MSR_IA32_S_CET, CET_ENDBR_EN | CET_NO_TRACK_EN)`
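To make the MSR value in solution 2 concrete, here is a small user-space sketch that only computes the bit mask. The bit positions (bit 2 for `CET_ENDBR_EN`, bit 4 for `CET_NO_TRACK_EN`) are assumptions taken from the Intel SDM layout of `IA32_S_CET` and should be checked against the kernel headers in use. The helper name `s_cet_value` is hypothetical, and nothing here touches the real MSR.

```c
#include <stdint.h>

/* Assumed bit layout of IA32_S_CET (verify against the kernel's msr-index.h). */
#define CET_ENDBR_EN    (1ULL << 2)  /* enable indirect branch tracking */
#define CET_NO_TRACK_EN (1ULL << 4)  /* honor the notrack prefix */

/* Hypothetical helper: the value that the wrmsrl() call above would write. */
uint64_t s_cet_value(void)
{
    return CET_ENDBR_EN | CET_NO_TRACK_EN;
}
```

With both bits set, IBT stays on for ordinary indirect branches, while branches carrying the `notrack` prefix are exempt from the `ENDBR` landing requirement.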

As an aside, interestingly, if a user-mode C program uses a switch
statement that meets the conditions for generating a jump table (some
distributions build gcc with `-fcf-protection=full` enabled by default),
the generated jump table will use a `jmp` with the `notrack` prefix, and
IBT will be marked as `true` in the `.note.gnu.property` section of the
compiled ELF file, so that the `NO_TRACK_EN` bit of the `MSR` will be set
in user mode when the kernel loads the program. So user mode can use
`notrack` to bypass CET without caring whether `NO_TRACK_EN` is set.
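The aside about user-mode jump tables can be tried with a sketch like the one below. The function `dispatch` is a hypothetical example. Whether the compiler really emits a jump table depends on the gcc version and optimization level, so treat the expected assembly as an assumption to verify rather than a guarantee.

```c
/*
 * Hypothetical example: a dense switch whose cases run different code,
 * which gcc typically lowers to a jump table at -O2.
 */
int dispatch(int op, int x)
{
    switch (op) {
    case 0: return x + 1;
    case 1: return x * 3;
    case 2: return x - 7;
    case 3: return x / 2;
    case 4: return x ^ 0x55;
    case 5: return -x;
    case 6: return x + x;
    case 7: return x % 5;
    default: return 0;   /* out-of-range opcodes */
    }
}
```

Compiling with `gcc -O2 -fcf-protection=full -S` and searching the output for `notrack jmp` should show the prefixed indirect jump, provided a jump table was generated.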
@biathlon3 biathlon3 linked an issue Jul 5, 2024 that may be closed by this pull request
@biathlon3 biathlon3 marked this pull request as draft July 5, 2024 14:08
@biathlon3
Contributor Author

For deployment see file install.txt. Description will be expanded.

@biathlon3
Contributor Author

This PR works on Kernel 6.8.9

Description of how to use hscollider: http://intel.github.io/hyperscan/dev-reference/tools.html

Description of the regular expression syntax: https://perldoc.perl.org/perlre
Hyperscan supports it only partially; the constraints are described here: http://intel.github.io/hyperscan/dev-reference/compilation.html

Regexes are now configured in the Tempesta config as follows:
for locations, the keyword "regex" was added and the regex starts with "^";
for httptables, the regex starts with "^".

for example:

location regex "^/new/" {
    frang_limits {
        http_body_len 5;
        http_strict_host_checking true;
    }
}

http_chain {
  uri == "^/html/" -> default;
  -> default;
}
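To illustrate what an anchored pattern such as `"^/new/"` matches, here is a minimal user-space sketch. It uses POSIX `regcomp`/`regexec` as a stand-in. The PR itself relies on Hyperscan inside the kernel, so this is not the PR's matching code, and the helper name `uri_matches` is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if uri matches pattern, 0 if it does not,
 * and -1 if the pattern fails to compile.
 */
int uri_matches(const char *pattern, const char *uri)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, uri, 0, NULL, 0);  /* 0 means a match was found */
    regfree(&re);
    return rc == 0;
}
```

Under this sketch, `uri_matches("^/new/", "/new/index.html")` matches, while `uri_matches("^/new/", "/old/new/")` does not, because `^` anchors the pattern to the start of the URI.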

@biathlon3 biathlon3 force-pushed the ag_Multi-pattern-regular-expressions branch from 9082e54 to 5c02ff5 Compare July 8, 2024 06:04
@biathlon3 biathlon3 force-pushed the ag_Multi-pattern-regular-expressions branch from 5c02ff5 to 8c64bfa Compare July 8, 2024 17:09
@biathlon3 biathlon3 force-pushed the ag_Multi-pattern-regular-expressions branch 2 times, most recently from 4f4608e to d932ab5 Compare July 25, 2024 11:13
@biathlon3 biathlon3 marked this pull request as ready for review July 26, 2024 13:24
@biathlon3
Contributor Author

I squashed it into one commit so it can be reviewed.

@biathlon3 biathlon3 force-pushed the ag_Multi-pattern-regular-expressions branch from d932ab5 to 5f2b864 Compare July 29, 2024 16:35
Made the regex configuration work the same way as in Nginx.
Development

Successfully merging this pull request may close these issues.

Multi-pattern regular expressions
2 participants