Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] CodeAnalyzerDFA is failing with timeout error: InternalExecutionError, Path evaluation timed out after 18000000 #1186

Open
acc-akar opened this issue Sep 20, 2023 · 20 comments
Labels
BUG P3 Rarely Malfunction SFGE Issues related to the Salesforce Graph Engine

Comments

@acc-akar
Copy link

acc-akar commented Sep 20, 2023

Describe the bug
Getting error while scanning DFA:
Graph Engine reached the path expansion upper limit (8360). The analysis preemptively stopped running on this path to prevent an OutOfMemory error. Rerun Graph Engine and target this entry method with a larger heap space.

To Reproduce
Run command: sfdx scanner:run:dfa --sfgejvmargs "-Xmx20g" --format=csv --outfile=CodeAnalyzerDFA.csv --target="./" --projectdir="./" --category="Security"

**Current behavior
Getting InternalExecutionError with above error.

Expected behavior
Expecting scanning result to pass with 0 violation, but the scan terminates.

Additional context
Attaching log file for review.
sfge.log

"Urgency": Impacting business, need to submit package for security review.

@acc-akar acc-akar changed the title Graph Engine reached the path expansion upper limit (8360), OutOfMemory error. Rerun Graph Engine and target this entry method with a larger heap space InternalExecutionError: Graph Engine reached the path expansion upper limit (8360), OutOfMemory error. Rerun Graph Engine and target this entry method with a larger heap space Sep 20, 2023
@jfeingold35
Copy link
Collaborator

@abhishekkarns , please see our documentation on this matter.

@acc-akar
Copy link
Author

acc-akar commented Sep 21, 2023

We tried everything suggested:

  • Increased heap size
  • Set pathexplimit to -1
  • Increased timeout to 30 minutes
  • Ran classes indiviually

Running classes indivually then only few classes are failing that is either batch class or class having lots of logic and using helper classes. Helper classess scaning without any issue but when running Parent class getting InternalExecutionError.

We are doing callout in helper class, is that causing any issue? Do we need to have any specific way to handle callouts or future methods for scanner?

Please have a look and advise!

@jfeingold35
Copy link
Collaborator

What specific InternalExecutionError are you still getting when you run with just the parent class?

@acc-akar
Copy link
Author

Error:
"4","3","D:\Netsutra\Vs Project\MUSTER\Muster Advocacy App\force-app\main\default\classes\MusterContactBatch.cls","100","17","MusterContactBatch","execute","","","","InternalExecutionError","Path evaluation timed out after 1800000 ms","https://forcedotcom.github.io/sfdx-scanner/en/v3.x/salesforce-graph-engine/rules/#ApexFlsViolationRule","InternalExecutionError","sfge"

I believe it must be something related to callout and future call that is happening in helper class but helper class is scanner without any issue indivisually?

@jfeingold35
Copy link
Collaborator

Have you tried using a greater timeout than 30 minutes? It sounds like your code is pretty complicated, so it's possible that you might just need to give it more time.

@acc-akar
Copy link
Author

Increased timeout to 2.5 hours but still getting same error:
"4","3","D:\Netsutra\Vs Project\MUSTER\Muster Advocacy App\force-app\main\default\classes\MusterSettingsController.cls","20","31","MusterContactBatch","getMusterData","","","","InternalExecutionError","Path evaluation timed out after 9000000 ms","https://forcedotcom.github.io/sfdx-scanner/en/v3.x/salesforce-graph-engine/rules/#ApexFlsViolationRule","InternalExecutionError","sfge"

Do we need to handle callout and future call in some specific manner for DFA scanner?

@jfeingold35
Copy link
Collaborator

My recollection is that you shouldn't have to do any kind of special handling for those. If you send me a sample of what the callouts/future calls being invoked are and how they're being invoked, I can do some testing. But my suspicion is that you might just need to increase the timeout.

@acc-akar
Copy link
Author

acc-akar commented Sep 25, 2023

@jfeingold35 - Per suggestion, we tried running the scan with increased timeout. The scan even failed when timeout set as 5 hours, also tried a couple of times but it did not went through and still getting same error. Not sure if the scanner is getting stuck somewhere.

Do you need the complete code to execute or you want to see the code and run scanner to debug it? Since the code has dependencies on custom objects, etc. it may not be possible to share executable code.

However, I (and my team) can walk you through the code in a working session if we could connect over a meeting.
Let me know if any of these following time slot works for you, and you may join using below meeting link:

  • Monday, 9/25 10AM - 12AM EST
  • Tuesday, 9/26 8.30AM - 12AM EST

Alternatively, you may suggest a convenient time for us to connect.

Join Zoom Meeting
https://agilecloudconsulting.zoom.us/j/84219299710?pwd=VytLVEpNUXBHcEFTNEFOYjAxbEo2Zz09

Meeting ID: 842 1929 9710
Passcode: 531610

Thanks in advance!

sfge (2).log

@acc-akar
Copy link
Author

acc-akar commented Sep 25, 2023

@jfeingold35 - We have shared an invite of the GIT code repository with you to further review. And, let me know if you need access to the sandbox org to test the code.

Scan failing for these classes:
MusterSettingsController
MusterContactBatch
MusterContactAccountPost_Batch
MusterCheckBatchStatus_Batch
MusterActionBatch

@acc-akar
Copy link
Author

@jfeingold35 - Hope you are doing well!
Let me know if you got a chance to review the issue and have any updates/suggestions.

@jfeingold35
Copy link
Collaborator

@abhishekkarns , sorry, yesterday was Yom Kippur, so I spent all day in shul.
If I have a chance to look at this, I'll certainly do so. That said, our team's bandwidth is very limited right now, so I truly can't make any promises.
In the meantime, please try running without --pathexplimit -1. This will help us determine whether it's spending 5 hours expanding a ludicrous number of paths or if it's simply getting stuck somewhere.

@acc-akar
Copy link
Author

acc-akar commented Sep 27, 2023

@jfeingold35 Removed --pathexplimit -1 from command and ran for timeout 18000000, now getting OutOfMemory error, asking to run with a larger heap space, see error below:

"4","3","D:\Muster\Muster Advocacy App\force-app\main\default\classes\MusterSettingsController.cls","20","31","MusterSettingsController","getMusterData","","","","LimitReached","Graph Engine reached the path expansion upper limit (7431). The
analysis preemptively stopped running on this path to prevent an OutOfMemory error. Rerun Graph Engine and target this entry method with a larger heap space.","https://forcedotcom.github.io/sfdx-scanner/en/v3.x/salesforce-graph-engine/working-with-sfge/#understand-outofmemory-errors","InternalExecutionError","sfge"

So without --pathexplimit -1 getting OutOfMemory error and with --pathexplimit -1 getting timed out error even for 5 hours.

Please advise.

@jfeingold35
Copy link
Collaborator

@abhishekkarns , okay. That does seem to support the notion that it's just a huge number of paths being expanded, and that sort of thing takes a while. You might just need to keep increasing the timeout until it passes.
I'm confident that an HTTP callout wouldn't be breaking the scanner, but if you're concerned about the future methods, you can always add the following snippet immediately before the invocation:

String s = nulll;
System.debug(s.length());

This will force a null pointer exception, which halts further expansion of the path in question. If adding that before your future-call resolves this, that at least confirms that it's related.

@acc-akar
Copy link
Author

@jfeingold35 - We have already tried running the scan with 5 hours as timeout, and per suggestion, we added the code snippet before the future invocation but it still failed with below timeout error:

"4","3","D:\Muster\Muster Advocacy App\force-app\main\default\classes\MusterSettingsController.cls","695","26","MusterSettingsController","syncBatchBasedOnDirection","","","","InternalExecutionError","Path evaluation timed out after 18000000 ms","https://forcedotcom.github.io/sfdx-scanner/en/v3.x/salesforce-graph-engine/rules/#ApexFlsViolationRule","InternalExecutionError","sfge"

I truly appreciate your time in helping resolving the issue.
We are in a bit of tough situation. Since last month a clean report from this tool is a requirement to submit for security review.
We are getting clean report for "CheckMarx" and "CodeAnalyzerGeneral", but unable to get the clean "CodeAnalyzerDFA" and this is holding up the security review submission.
What else we can do or provide you to help expediate resolve the issue?

Please suggest.

@acc-akar acc-akar changed the title InternalExecutionError: Graph Engine reached the path expansion upper limit (8360), OutOfMemory error. Rerun Graph Engine and target this entry method with a larger heap space CodeAnalyzerDFA is failing with timeout error: InternalExecutionError, Path evaluation timed out after 18000000 Sep 28, 2023
@jfeingold35
Copy link
Collaborator

@abhishekkarns , I'll start with the good news: a timeout error like what you're seeing is not a blocker for security review submission. You can always document it as a false positive and submit.

The bad news is that if injecting the null pointer exception before the future-call didn't resolve the timeout, then it means the future call isn't what's causing the timeout. It really just sounds like the code is complex, and it's taking a while to scan. And in that case, there's really only a few things you can do.

  1. You can try increasing the timeout until the run succeeds. If that means increasing it to something longer than 5 hours, then you wouldn't be the first.
  2. You can try refactoring your code to make it easier for the scanner to evaluate. Per the documentation, Graph Engine identifies path entry points, and expands paths flowing out of those points. Every time path expansion encounters a control flow branch (e.g., an if-else, try-catch, a loop, etc), it forks a new path. This is where most of the runtime comes from. Therefore, the most valuable kinds of refactor are:
    a. Refactors that strictly decrease the total number of paths in the codebase. Removing an unnecessary if early in a method removes not only that one path, but all the paths that fork off from that path. So small refactors can have big impacts.
    b. Refactors that more evenly distribute paths between entrypoints. Entrypoints are evaluated in parallel, and the timeout applies to each entrypoint individually (e.g., a 5-hour timeout means each entrypoint has 5 hours to complete). So two entrypoints with 50 paths each will evaluate faster than a single entrypoint with 100 paths, as illustrated below.
	/**
	 * This entrypoint evaluates slowly because `b`'s value is unknown, meaning there are two branches.
	 */
	global void combinedEntrypoint(boolean b) {
		if (b) {
			// ...
		} else {
			// ...
		}
	}

	/**
	 * This entrypoint evaluates quick because it passes a literal into the helper method and therefore avoids branching.
	 */
	global void splitEntrypointA() {
		// Literal value passed into helper method.
		helperMethod(true);
	}

	/**
	 * This entrypoint evaluates quick because it passes a literal into the helper method and therefore avoids branching.
	 */
	global void splitEntrypointB() {
		// Literal value passed into helper method.
		helperMethod(false);
	}

	/**
	 * This method is identical to `combinedEntrypoint`.
	 */
	public void helperMethod(boolean b) {
		if (b) {
			// ...
		} else {
			// ...
		}
	}

@acc-akar
Copy link
Author

We are back :)

We refactored our code to reduce number of IF-ELSE and move them at lower levels and tried to identify what piece of code is causing the timeout error, but no success. What works independently, does not work when called by another class!
Attached is simplified version of code to try to help identify root cause of the timeout problem.

Running command for path limit -1 and timeout 18000000, still not working and getting timeout error: InternalExecutionError, Path evaluation timed out after 18000000
Command:
sfdx scanner:run:dfa --sfgejvmargs "-Xmx20g" --target './force-app/main/default/classes/MusterSettingsController.cls' --projectdir="./" --category="Security" --format csv --pathexplimit -1 --rule-thread-timeout 18000000

In sfge log file seems it stuck at below lines of MusterCustomHandler class.
resStatus =MainList[0].resStatusCode;
objList = MainList[0].customWpr;

From the log, looks like scanner is iterating on something and getting into infinite loop !!
Scanner not actually running the code, right?
Can you please help?

Debug Log.txt
Sample Code.txt

@acc-akar
Copy link
Author

Hi,

Today we did bit of more digging, looks like getting value from 0th index of a list is causing the problem -

if(MainList != Null && MainList.size() > 0) {
resStatus =MainList[0].resStatusCode;
resMessage = MainList[0].resMessage;

Instead of getting value from list's 0th index, changed the code to return object and set values as below:

if(MainList != Null) {
resStatus = MainList.resStatusCode;
resMessage = MainList.resMessage;
objList = MainList.customWpr;
}

Scanning the class took around 5 hours but succeeded with no violations found!

Does this makes sense?
What is your opinion on this? We have simplified the code a lot, to find the root cause of the error but not sure if we found it !

@jfeingold35
Copy link
Collaborator

@acc-akar , I'm able to reproduce this with the code you provided, but I'm still working on determining a precise cause. In the meantime, if you've already discovered a refactor you can do that unblock the analysis, I would strongly encourage you to continue using that workaround for now.

@johnbelosf johnbelosf changed the title CodeAnalyzerDFA is failing with timeout error: InternalExecutionError, Path evaluation timed out after 18000000 [BUG] CodeAnalyzerDFA is failing with timeout error: InternalExecutionError, Path evaluation timed out after 18000000 Feb 20, 2024
@johnbelosf johnbelosf added the BUG P3 Rarely Malfunction label Feb 20, 2024
Copy link

git2gus bot commented Feb 20, 2024

This issue has been linked to a new work item: W-15080470

@rmohan20
Copy link
Contributor

Duplicate of #1294

@rmohan20 rmohan20 marked this as a duplicate of #1294 Mar 11, 2024
@stephen-carter-at-sf stephen-carter-at-sf added the SFGE Issues related to the Salesforce Graph Engine label May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BUG P3 Rarely Malfunction SFGE Issues related to the Salesforce Graph Engine
Projects
None yet
Development

No branches or pull requests

5 participants