Working with Salesforce Graph Engine

Before you learn about Salesforce Graph Engine, review how code paths are structured. Each code path contains these three elements.

Source: An entry point for an external interaction.
Sink: Code that modifies data.
Sanitizer: The check that happens between source and sink to ensure that the user who performs this action on the data has the necessary access to the object and fields.

A code path must have a sanitizer in between the source and the sink. When the sanitizer is missing, Graph Engine returns a violation. To avoid violations, ensure that each path created from any source to sink is sanitized.

A source can lead to multiple sinks. Also, a sink can be reached through multiple sources. In fact, we can have multiple paths between the same source and sink.

To invoke data-flow-based rules through Code Analyzer, run the sf scanner run dfa command.

An individual row in a Graph Engine results file represents a violation. Each violation contains sink and source information, plus the actual violation message.

Here’s a breakdown of the output you see by column name.

Rule: The rule that was run which led to the violation.
Severity: The severity of the violation. By default, all security violations are marked as severity 1.
Message: The violation message. For more about Graph Engine violation messages, see our FAQ.
Category:
URL:
Sink Line, Sink Column, Sink File Name: The location where the data interaction happens in your source code.
Source Line, Source Column: The location where the path begins.
Source Type, Source MethodName: Additional information to help identify the path entry.

Like all security tools, Graph Engine can create false negatives or false positives. For example, the engine can fail to create a violation where the code is insecure, which is a false negative. Or it can create a violation even though the code is secure, a false positive.

If you determine that Graph Engine created a false positive, add engine directives to your code so that Graph Engine doesn’t throw that violation anymore.

Graph Engine understands three levels of engine directives.

Disable Next Line
Disable Method
Disable Line

To disable just the sink from Graph Engine’s analysis, run disable-next-line. For example:

/* sfge-disable-next-line <rule_name> */

To disable just the sink from Graph Engine’s analysis, run disable-next-line. For example:

To disable all the sink operations in paths passing through this method, use disable-stack. As with the other engine directives, make sure that you add it in the line immediately before the method declaration. For example:

/* sfge-disable <rule_name> */

To disable all the sink operations that occur in the class, run disable. As with other engine directives, add it in the line immediately before the class declaration. For example:

If your Graph Engine analysis is intentionally blocked, it’s because Graph Engine identified something incorrect in your code. You must modify your code to unblock the analysis. Depending on the situation, you see one of these messages.

Message	Violation	When it Occurs
Remove unreachable code to proceed with the analysis.	User Action	Returned one time on an entire analysis. Analysis of all code is blocked.
Rename or delete this reused variable to proceed with the analysis.	User Action Violation	Returned on a single path. Analysis of only that code path is blocked.

This code example produces multiple actions, such as a throw statement followed by a return statement.

A Graph Engine analysis attempt on this code results in the entire analysis being blocked and the User Action message is returned: Remove unreachable code to proceed with the analysis.

This code example reuses a variable, String input, in the same scope of a method.

A Graph Engine analysis attempt on this code path results in a User Action Violation on this path. Analysis on other paths can proceed. Sometimes other violations are returned. This message is returned: Rename or delete this reused variable to proceed with the analysis.

When Graph Engine analyzes highly complex code, it runs out of heap space, which results in a LimitReached error. To decrease the occurrence of LimitReached errors and to complete as much analysis as possible within a shorter period, we added processing limits on Graph Engine. These limits help Graph Engine to fail fast when a path’s analysis is approaching a LimitReached error. This fail-fast process includes preemptively aborting a path analysis when Graph Engine encounters a path that’s too complex.

To proactively reduce the chances of a LimitReached error in your scans, take these steps.

Execute scanner run dfa with default heap space settings and collect the results to a file using the --outfile parameter. The output file contains the majority of the actionable items.
Filter your output file on the LimitReached violation and group these violations into sets of targets. LimitReached violations are the more complex paths that need more heap space and time.
To determine your previous execution’s path expansion limit and maximum heap space allocated, search for this string in /<home>/.sfdx-scanner/sfge.log: “Path expansion limit”–You use these values later to control the complexity that Graph Engine can handle for your code.
Execute scanner run dfa on each LimitReached target grouping that you created.
Run scanner run dfa iteratively with larger memory allocation each time to exclusively target complex areas.

Sample command allocating max heap space of 20G:

sf scanner run dfa --projectdir /path/to/full/project --target /path/to/a/source/file#optionalSpecificEntryMethod --sfgejvmargs -Xmx20g --outfile result_2.csv

To optimize your LimitReached scans, follow these recommendations.

Use individual file names in --target parameter or names specific to the target method.
Use the --sfgejvmargs parameter to define a larger heap space than the default.

If the same target row repeatedly reaches the limit, follow these steps.

Remove the upper limit by passing in --pathexplimit -1.
Decrease the number of parallel threads by setting the --rule-thread-count parameter to 2.
Increase timeout by setting the --rule-thread-timeout parameter to 300000 ms.

Two Graph Engine parameters, --sfgejvmargs and --pathexplimit, act as knobs that turn the max heap size and the complexity of Graph Engine scans up or down. Use these knobs to fine-tune your code’s analysis and rate of OutOfMemory occurrences.

Use the --sfgejvmargs parameter to modify your Java Virtual Machine (JVM) default max heap size.

Look up your JVM -Xmx value, which is your allocated heap size.
Use the --sfgejvmargs parameter to increase your -Xmx value on scanner run dfa command.
Execute Graph Engine with a larger heap space than the default settings.

For example, to allocate 2G heap space:

To maximize your heap space balance with Graph Engine performance, follow these recommendations.

Because the heap space value depends on the complexity of the target codebase, there’s no magic number. A very large heap space can degrade Graph Engine’s performance, so increase the heap space allocation in increments of 1 G. Experiment to see what works for your project.
Target a smaller set of files for analysis. Provide a subset of Apex files using the --target flag on the scanner run dfacommand while keeping the same --projectdir value. This approach reduces the number of paths and reduces the likelihood of OutOfMemory errors.
To avoid large IF/ELSE-IF/ELSE conditional trees, simplify your code, which helps bring down the number of paths created.

Heap space allocated for a scanner run dfa execution also dictates how much complexity Graph Engine can handle. If you ran our recommended steps earlier, grab the path expansion limit that you looked up.

Override your path expansion limit using the --pathexplimit parameter. Or remove the limit by passing in this value as -1.

To find more information about path expansion limits, refer to the OutOfMemory Error section in the FAQ.

Graph Engine has these limitations.

Violations thrown as Internal error. Work in progress. Please ignore indicate that the entry point’s analysis didn’t complete successfully. We’re working on fixing this issue. In the meantime, you must verify the validity of this error manually.
Graph Engine handles unique class names. If the source code has two distinctly different files that have classes with duplicate names, Graph Engine fails with an error message: <example_class> is defined in multiple files. In cases like these, provide a --projectdir subpath to the source directory that has only one of the file names, and rerun Graph Engine with the subpath to the second file name.
Graph Engine doesn’t handle anonymous Apex script. Provide the class directory path as the --projectdir that doesn’t include any anonymous Apex script.
Graph Engine doesn’t handle namespace placeholders. Leave the namespace placeholder blank.
Graph Engine supports Apex property chains with a depth of 2 or fewer. For example, Graph Engine supports Object.x but not Object.x.y.
Graph Engine doesn’t scan triggers.

We appreciate your help in identifying and fixing issues with Salesforce Graph Engine. To report bugs, create a new issue.

To verify your bug, include publicly shareable sample code.

Create sample code without actual variable names that still mimics the original issue as closely as possible.
Ensure that your sample code runs into the same error as your original code.

If you have thoughts on usability, how the tool works, or new feature requests, we welcome your feedback.