CPD

Use the Copy/Paste Detector (CPD) engine to identify blocks of duplication across files written in various languages.

When you run Code Analyzer in your project and specify this engine, each violation lists the files in which the duplication was found, along with the location of the duplicate blocks. We recommend that you use the --view detail flag, which makes the output easier to review. For example:
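A typical invocation might look like the following sketch. It assumes that the Salesforce CLI (`sf`) with the Code Analyzer plugin is installed, that the engine is selected with `--rule-selector cpd`, and that you run the command from the root of your project.

```shell
# Run only the CPD engine's rules and print each violation in full,
# including the locations of every duplicate block.
sf code-analyzer run --rule-selector cpd --view detail
```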

Check the locations section of the violation output for the files that contain the duplication. For example:

Code Analyzer supports these languages associated with CPD: Apex, HTML, JavaScript, TypeScript, Visualforce, and XML.

Run this command to view detailed information about the CPD rules for all available languages:
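As a sketch, and under the same assumptions as before (the Salesforce CLI with the Code Analyzer plugin installed, and `cpd` used as the rule selector), the command might look like this:

```shell
# List the CPD rules for every supported language, with full details
# such as severity, tags, and description.
sf code-analyzer rules --rule-selector cpd --view detail
```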

For information on how to modify rule settings, such as their severity or tags, see Customize Your Configuration. Although the examples there modify rules for the ESLint and Regex engines, you use the same process to modify CPD rules.

| Field | Type | Description |
| --- | --- | --- |
| disable_engine | boolean | Whether to turn off the 'cpd' engine so that it is not included when running Code Analyzer commands. Default value is false. |
| java_command | string | Indicates the specific 'java' command associated with the JRE or JDK to use for the 'cpd' engine. May be provided as the name of a command that exists on the path, or an absolute file path location. If unspecified, or specified as null, then an attempt will be made to automatically discover a 'java' command from your environment. Default value is null. |
| file_extensions | object | Specifies the list of file extensions to associate to each rule language. The rule(s) associated with a given language will run against all the files in your workspace containing one of the specified file extensions. Each file extension can only be associated to one language. If a specific language is not specified, then a set of default file extensions for that language will be used. Default value is {"apex": [".cls", ".trigger"], "html": [".html", ".htm", ".xhtml", ".xht", ".shtml", ".cmp"], "javascript": [".js", ".cjs", ".mjs"], "typescript": [".ts"], "visualforce": [".page", ".component"], "xml": [".xml"]}. |
| minimum_tokens | object | Specifies the minimum tokens threshold for each rule language. The minimum tokens threshold is the number of tokens required to be in a duplicate block of code in order to be reported as a violation. The concept of a token may be defined differently per language, but in general it is a distinct basic element of source code. For example, this could be language specific keywords, identifiers, operators, literals, and more. See https://docs.pmd-code.org/latest/pmd_userdocs_cpd.html to learn more. If a value for a language is unspecified, then the default value of 100 will be used for that language. Default value is {"apex": 100, "html": 100, "javascript": 100, "typescript": 100, "visualforce": 100, "xml": 100}. |
| skip_duplicate_files | boolean | Indicates whether to ignore multiple copies of files of the same name and length. Default value is false. |
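
To show how these fields fit together, here is a minimal sketch that writes a configuration file and runs CPD against it. Several things in it are assumptions rather than facts from this page: that CPD settings are nested under `engines.cpd`, that the `--config-file` flag points Code Analyzer at the file, and the file name `cpd-example.yml`, which is hypothetical and chosen so that an existing configuration is not overwritten. The values are only illustrative.

```shell
# Hypothetical file name, used so that an existing config is left untouched.
config="cpd-example.yml"

# Write a config that overrides a few CPD engine settings.
# Assumption: engine settings are nested under engines.cpd.
cat > "$config" <<'EOF'
engines:
  cpd:
    disable_engine: false
    # Report Apex duplicates of 50 or more tokens.
    # Languages not listed here keep the default threshold of 100.
    minimum_tokens:
      apex: 50
    # Ignore multiple copies of files that share a name and length.
    skip_duplicate_files: true
EOF

# Run CPD with the custom config. The run is skipped when the
# Salesforce CLI is not available on the PATH.
if command -v sf >/dev/null 2>&1; then
  sf code-analyzer run --config-file "$config" --rule-selector cpd --view detail
fi
```

Lowering minimum_tokens for a language makes CPD report shorter duplicate blocks, which usually means more violations; raising it limits the report to larger duplicates.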