credsweeper.filters package

Submodules

credsweeper.filters.filter module

class credsweeper.filters.filter.Filter(config: Config | None, *args)[source]

Bases: ABC

Base class for all filters that operate on ‘line_data’ objects.

abstract run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
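A minimal sketch of the contract described above, written with stand-in classes so that it runs without credsweeper installed. `FakeLineData`, `SketchFilter`, `ShortValueFilter` and `keep_candidate` are hypothetical names used only for illustration; the one detail taken from this reference is that the candidate text is available as `line_data.value`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeLineData:
    """Hypothetical stand-in for LineData; only `value` is modelled."""
    value: Optional[str]


class SketchFilter(ABC):
    """Stand-in that mirrors the documented signature of Filter.run()."""

    @abstractmethod
    def run(self, line_data: FakeLineData, target: object) -> bool:
        """Return True to filter the candidate out, False to keep it."""


class ShortValueFilter(SketchFilter):
    """Illustrative filter: drop candidates shorter than 4 characters."""

    def run(self, line_data: FakeLineData, target: object) -> bool:
        return line_data.value is None or len(line_data.value) < 4


def keep_candidate(line_data: FakeLineData,
                   filters: Iterable[SketchFilter],
                   target: object = None) -> bool:
    """A candidate survives only if no filter asks to drop it."""
    return not any(f.run(line_data, target) for f in filters)
```

The point of the sketch is the polarity of the return value: `True` means "discard", so a chain of filters keeps a candidate only when every `run()` returns `False`.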

credsweeper.filters.line_git_binary_check module

class credsweeper.filters.line_git_binary_check.LineGitBinaryCheck(config: Config | None = None)[source]

Bases: Filter

Checks that the line is not part of a git binary patch

base85string = re.compile('^[A-Za-z][0-9A-Za-z!#$%&()*+;<=>?@^_`{|}~-]{6,65}$')

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
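As a rough illustration, the documented `base85string` pattern can be tried on its own with the standard `re` module. The helper name `looks_like_git_binary_line` is hypothetical, and which string the real filter applies the pattern to is not stated here; the sketch assumes a whole line.

```python
import re

# Pattern copied from the class attribute documented above.
base85string = re.compile(r'^[A-Za-z][0-9A-Za-z!#$%&()*+;<=>?@^_`{|}~-]{6,65}$')


def looks_like_git_binary_line(line: str) -> bool:
    """True when the whole line fits the base85 shape of a binary patch row."""
    return base85string.match(line) is not None
```

A line such as `zcmV?d00001` fits the shape, while an ordinary assignment containing spaces or quotes does not.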

credsweeper.filters.line_specific_key_check module

class credsweeper.filters.line_specific_key_check.LineSpecificKeyCheck(config: Config | None = None)[source]

Bases: Filter

Check that none of the values from the list below appear in the candidate line.

NOT_ALLOWED = ['example', '\\benc[\\(\\[]', '\\btrue\\b', '\\bfalse\\b']

NOT_ALLOWED_PATTERN = re.compile('(?:example|\\benc[\\(\\[]|\\btrue\\b|\\bfalse\\b)', re.IGNORECASE)

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
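The following sketch rebuilds `NOT_ALLOWED_PATTERN` from the `NOT_ALLOWED` list shown above and searches a line with it. The helper `has_specific_key` is a hypothetical name, and the use of `search` on the full line is an assumption rather than a statement about the real `run()` body.

```python
import re

# Entries copied from the documented NOT_ALLOWED list.
NOT_ALLOWED = ["example", r"\benc[\(\[]", r"\btrue\b", r"\bfalse\b"]
NOT_ALLOWED_PATTERN = re.compile("(?:" + "|".join(NOT_ALLOWED) + ")",
                                 re.IGNORECASE)


def has_specific_key(line: str) -> bool:
    """True when the line contains one of the not-allowed markers."""
    return NOT_ALLOWED_PATTERN.search(line) is not None
```

Note the word boundaries: `true` inside a longer token such as `Xy7TrUeQ` is not a hit, whereas a standalone `true` is.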

credsweeper.filters.line_uue_part_check module

class credsweeper.filters.line_uue_part_check.LineUUEPartCheck(config: Config | None = None)[source]

Bases: Filter

Checks that the line is not part of UU-encoded data; only maximal-length lines are checked

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

uue_string = re.compile('^M[!-`]{60}$')
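To see why the pattern is `M` followed by exactly 60 characters: uuencode puts at most 45 bytes on a line, the length character for 45 is `M`, and 45 bytes expand to 60 printable characters. The sketch below checks this with `binascii.b2a_uu`; the helper name `is_full_uue_line` is hypothetical.

```python
import binascii
import re

# Pattern copied from the class attribute documented above.
uue_string = re.compile(r'^M[!-`]{60}$')


def is_full_uue_line(line: str) -> bool:
    """True when the line has the shape of a full-length uuencoded row."""
    return uue_string.match(line) is not None


# Encode one maximal 45-byte chunk and strip the trailing newline.
full_line = binascii.b2a_uu(b"\xff" * 45).decode("ascii").rstrip("\n")
```

Shorter rows, such as the last row of an encoded file, start with a different length character and are therefore not matched by this pattern.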

credsweeper.filters.value_allowlist_check module

class credsweeper.filters.value_allowlist_check.ValueAllowlistCheck(config: Config | None = None)[source]

Bases: Filter

Check that the patterns do not MATCH the candidate value.

ALLOWED = ['ENC\\(.*\\)', 'ENC\\[.*\\]', '\\$\\{(\\*|[0-9]+|[a-z_].*)\\}', '\\$[0-9]+(\\s|$)', '\\$\\$[a-z_]+(\\^%[0-9a-z_]+)?', '#\\{.+\\}', '\\{\\{.+\\}\\}', '.*@@@hl@@@(암호|비번|PW|PASS)@@@endhl@@@']

ALLOWED_PATTERN = re.compile('(?:ENC\\(.*\\)|ENC\\[.*\\]|\\$\\{(\\*|[0-9]+|[a-z_].*)\\}|\\$[0-9]+(\\s|$)|\\$\\$[a-z_]+(\\^%[0-9a-z_]+)?|#\\{.+\\}|\\{\\{.+\\}\\}|.*@@@hl@@@(암호|비번|PW|PASS)@@@endhl@@@)', re.IGNORECASE)

ALLOWED_QUOTED = ['\\$[a-z_][0-9a-z_]+((::|->|\\.)[a-z_]|\\[|$)', '\\$\\([^)]+\\)', '.*\\*\\*\\*']

ALLOWED_QUOTED_PATTERN = re.compile('(?:\\$[a-z_][0-9a-z_]+((::|->|\\.)[a-z_]|\\[|$)|\\$\\([^)]+\\)|.*\\*\\*\\*)', re.IGNORECASE)

ALLOWED_UNQUOTED = ['[~a-z0-9_]+((\\.|->)[a-z0-9_]+)+\\(.*$', '\\$[a-z_][0-9a-z_]+((::|->|\\.)[a-z_]|\\[|$)', '\\$\\([.0-9a-z_-]+', '.*\\*\\*\\*\\*\\*']

ALLOWED_UNQUOTED_PATTERN = re.compile('(?:[~a-z0-9_]+((\\.|->)[a-z0-9_]+)+\\(.*$|\\$[a-z_][0-9a-z_]+((::|->|\\.)[a-z_]|\\[|$)|\\$\\([.0-9a-z_-]+|.*\\*\\*\\*\\*\\*)', re.IGNORECASE)

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
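A small sketch of the idea, using only three of the documented `ALLOWED` entries (the `${...}`, `#{...}` and `{{...}}` placeholders). The names `ALLOWED_SUBSET` and `is_placeholder` are hypothetical, and using `match` at the start of the value is an assumption based on the class description.

```python
import re

# A subset of the documented ALLOWED list; the full list has more entries.
ALLOWED_SUBSET = [
    r"\$\{(\*|[0-9]+|[a-z_].*)\}",  # ${VAR}, ${1}, ${*}
    r"#\{.+\}",                     # #{var}
    r"\{\{.+\}\}",                  # {{ var }}
]
ALLOWED_SUBSET_PATTERN = re.compile("(?:" + "|".join(ALLOWED_SUBSET) + ")",
                                    re.IGNORECASE)


def is_placeholder(value: str) -> bool:
    """True when the value starts with a template or variable placeholder."""
    return ALLOWED_SUBSET_PATTERN.match(value) is not None
```

Values that are substituted at deploy time are not secrets themselves, which is presumably why a match leads to the candidate being filtered.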

credsweeper.filters.value_array_dictionary_check module

class credsweeper.filters.value_array_dictionary_check.ValueArrayDictionaryCheck(config: Config | None = None)[source]

Bases: Filter

Match call to dictionary or array element.

This filter checks only calls, not declarations:

token = values[i] would be filtered
token = {‘root’} would be kept

PATTERN = re.compile('\\[[\'\\"]?[^,]+[\'\\"]?]')

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
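The documented `PATTERN` can be exercised directly to see the difference between an element access and a declaration. The helper `is_element_access` is a hypothetical name, and searching the whole text with `search` is an assumption about how the pattern is applied.

```python
import re

# Pattern copied from the class attribute documented above.
PATTERN = re.compile(r"\[['\"]?[^,]+['\"]?]")


def is_element_access(text: str) -> bool:
    """True when the text contains a bracketed index or key without commas."""
    return PATTERN.search(text) is not None
```

Because `[^,]+` excludes commas, a list literal such as `[1, 2, 3]` is not treated as an element access.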

credsweeper.filters.value_atlassian_token_check module

class credsweeper.filters.value_atlassian_token_check.ValueAtlassianTokenCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate has a known structure

static check_atlassian_struct(value: str) → bool[source]

Returns False if value is valid for Atlassian structure ‘integer:bytes’

static check_crc32_struct(value: str) → bool[source]

Returns False if value is valid for Bitbucket app password structure ‘payload:crc32’

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_azure_token_check module

class credsweeper.filters.value_azure_token_check.ValueAzureTokenCheck(config: Config | None = None)[source]

Bases: Filter

Azure tokens contain a header, payload and signature. See https://learn.microsoft.com/en-us/azure/active-directory-b2c/access-tokens

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_base32_data_check module

class credsweeper.filters.value_base32_data_check.ValueBase32DataCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate is NOT an ASCII-encoded string, with entropy check

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received weird base32 token which must be a random string

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_base64_data_check module

class credsweeper.filters.value_base64_data_check.ValueBase64DataCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate is NOT an ASCII-encoded string, with entropy check

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received weird base64 token which must be a random string

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_base64_encoded_pem_check module

class credsweeper.filters.value_base64_encoded_pem_check.ValueBase64EncodedPem(config: Config | None = None)[source]

Bases: Filter

Check that candidate contains a base64-encoded PEM private key

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_base64_key_check module

class credsweeper.filters.value_base64_key_check.ValueBase64KeyCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate contains a base64-encoded private key

EXTRA_TRANS_TABLE = {34: None, 39: None, 44: None, 92: None}

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
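The keys of `EXTRA_TRANS_TABLE` are code points: 34 is `"`, 39 is `'`, 44 is `,` and 92 is the backslash. Mapping them to `None` makes `str.translate` delete those characters. The sketch below shows the effect; that the filter uses the table to strip quoting noise before decoding is an assumption, and `strip_extra` is a hypothetical helper name.

```python
# Table copied from the class attribute documented above.
EXTRA_TRANS_TABLE = {34: None, 39: None, 44: None, 92: None}


def strip_extra(value: str) -> str:
    """Delete double quotes, single quotes, commas and backslashes."""
    return value.translate(EXTRA_TRANS_TABLE)


# Two quoted fragments joined by a comma, with a trailing backslash.
raw = "\"TUlJ\",'RXZn'\\"
cleaned = strip_extra(raw)
```

This kind of clean-up is useful when a key is split across concatenated string literals in source code.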

credsweeper.filters.value_base64_part_check module

class credsweeper.filters.value_base64_part_check.ValueBase64PartCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate is NOT part of a long base64 line

base64_char_set = {'+', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '=', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '\\', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'}

base64_pattern = re.compile('^(\\\\{1,8}[0abfnrtv]|[0-9A-Za-z+/=]){1,4000}$')

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received weird base64 token which must be a random string

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
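For orientation, the documented `base64_pattern` accepts base64 characters plus escaped control sequences (one to eight backslashes followed by one of `0abfnrtv`), which covers base64 text embedded in a string literal with `\n` escapes. The helper `is_base64_like_line` is a hypothetical name; how `run()` combines the pattern with `base64_char_set` is not shown here.

```python
import re

# Pattern copied from the class attribute documented above.
base64_pattern = re.compile(r'^(\\{1,8}[0abfnrtv]|[0-9A-Za-z+/=]){1,4000}$')


def is_base64_like_line(line: str) -> bool:
    """True when the whole line is base64 text, optionally with escapes."""
    return base64_pattern.match(line) is not None
```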

credsweeper.filters.value_basic_auth_check module

class credsweeper.filters.value_basic_auth_check.ValueBasicAuthCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate has a known structure

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_blocklist_check module

class credsweeper.filters.value_blocklist_check.ValueBlocklistCheck(config: Config | None = None)[source]

Bases: Filter

Check that words from the block list make up less than 70% of the candidate value length.

NOT_ALLOWED = ['true', 'false', 'null', 'none', 'bearer', 'string', 'value', 'undefined', 'uuid']

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
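One way to read the 70% rule is sketched below: a value is suspicious when a block-listed word is contained in it and accounts for at least 70% of its length. This is an interpretation of the class description, not the library's code; the exact comparison at the boundary, the `ratio` parameter and the helper name `is_mostly_blocklisted` are assumptions.

```python
# List copied from the class attribute documented above.
NOT_ALLOWED = ['true', 'false', 'null', 'none', 'bearer',
               'string', 'value', 'undefined', 'uuid']


def is_mostly_blocklisted(value: str, ratio: float = 0.7) -> bool:
    """True when a block-listed word covers at least `ratio` of the value."""
    lowered = value.lower()
    return any(word in lowered and len(word) >= ratio * len(lowered)
               for word in NOT_ALLOWED)
```

Under this reading, `undefined` on its own is dropped, while a long token that merely begins with `bearer` is kept.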

credsweeper.filters.value_camel_case_check module

class credsweeper.filters.value_camel_case_check.ValueCamelCaseCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate is not written in camel case.

CAMEL_CASE = ['[a-z]+([A-Z][a-z]+)+', '[A-Z][a-z]+([A-Z][a-z]+)+']

CAMEL_CASE_PATTERN = re.compile('(?:[a-z]+([A-Z][a-z]+)+|[A-Z][a-z]+([A-Z][a-z]+)+)')

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
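The two `CAMEL_CASE` alternatives cover lower camel case (`myPasswordValue`) and upper camel case (`MyPasswordValue`). The sketch below applies the documented pattern with `fullmatch`; whether the real filter uses `match`, `search` or `fullmatch` is not stated in this reference, so that choice and the helper name `is_camel_case` are assumptions.

```python
import re

# Pattern copied from the class attribute documented above.
CAMEL_CASE_PATTERN = re.compile(
    '(?:[a-z]+([A-Z][a-z]+)+|[A-Z][a-z]+([A-Z][a-z]+)+)')


def is_camel_case(value: str) -> bool:
    """True when the entire value is a camel-case identifier."""
    return CAMEL_CASE_PATTERN.fullmatch(value) is not None
```

Values with digits or irregular capitalisation, which is what random tokens tend to look like, do not fit either alternative.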

credsweeper.filters.value_dictionary_keyword_check module

class credsweeper.filters.value_dictionary_keyword_check.ValueDictionaryKeywordCheck(config: Config | None = None)[source]

Bases: Filter

Check that no word from the dictionary is present in the candidate value.

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_discord_bot_check module

class credsweeper.filters.value_discord_bot_check.ValueDiscordBotCheck(config: Config | None = None)[source]

Bases: Filter

Discord bot Token

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_entropy_base32_check module

class credsweeper.filters.value_entropy_base32_check.ValueEntropyBase32Check(config: Config | None = None)[source]

Bases: ValueEntropyBaseCheck

Base32 entropy check

static get_min_data_entropy(x: int) → float[source]

Returns average entropy for size of random data. Precalculated data is applied for speedup

credsweeper.filters.value_entropy_base36_check module

class credsweeper.filters.value_entropy_base36_check.ValueEntropyBase36Check(config: Config | None = None)[source]

Bases: ValueEntropyBaseCheck

Base36 entropy check

static get_min_data_entropy(x: int) → float[source]

Returns minimal entropy for size of random data. Precalculated data is applied for speedup

credsweeper.filters.value_entropy_base64_check module

class credsweeper.filters.value_entropy_base64_check.ValueEntropyBase64Check(config: Config | None = None)[source]

Bases: ValueEntropyBaseCheck

Base64 entropy check

static get_min_data_entropy(x: int) → float[source]

Returns minimal average entropy for size of random data. Precalculated round data is applied for speedup

credsweeper.filters.value_entropy_base_check module

class credsweeper.filters.value_entropy_base_check.ValueEntropyBaseCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate value has minimal Shannon entropy for the appropriate base

abstract static get_min_data_entropy(x: int) → float[source]

Returns minimal entropy for size of data

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
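For readers unfamiliar with the measure, Shannon entropy over the characters of a string can be computed as below. This is a generic textbook sketch, not the library's implementation; the function name `shannon_entropy` is hypothetical, and the real check may compute entropy relative to a specific base alphabet and compare it with `get_min_data_entropy(len(value))`.

```python
import math
from collections import Counter


def shannon_entropy(data: str) -> float:
    """Entropy in bits per character, based on character frequencies."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A string of one repeated character has zero entropy, and a string of four distinct characters has two bits per character, so low values indicate repetitive, non-random text.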

credsweeper.filters.value_file_path_check module

class credsweeper.filters.value_file_path_check.ValueFilePathCheck(config: Config | None = None)[source]

Bases: Filter

Check whether the candidate value is a path.

Check if a value contains either ‘/’ or ‘:’ separators (but not both) and does not have any special characters ( !$@`&*()+)

base64stdpad_possible_set = {'+', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '=', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'}

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

unusual_linux_symbols_in_path = '\t\n\r!@`&*<>+=;,~^:\\'

unusual_windows_symbols_in_path = '\t\n\r!$@`&*(){}<>+=;,~^'

credsweeper.filters.value_github_check module

class credsweeper.filters.value_github_check.ValueGitHubCheck(config: Config | None = None)[source]

Bases: Filter

NPM or GitHub Classic Token validation

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_grafana_check module

class credsweeper.filters.value_grafana_check.ValueGrafanaCheck(config: Config | None = None)[source]

Bases: Filter

Grafana Provisioned API Key and Access Policy Token

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_grafana_service_check module

class credsweeper.filters.value_grafana_service_check.ValueGrafanaServiceCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate has a known structure

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_hex_number_check module

class credsweeper.filters.value_hex_number_check.ValueHexNumberCheck(config: Config | None = None)[source]

Bases: Filter

Check whether the value is a hex representation of a number of up to 64 bits

HEX_08_64_VALUE_REGEX = re.compile('^0x[0-9a-f]{1,16}$')

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
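The limit of 16 hex digits in `HEX_08_64_VALUE_REGEX` corresponds to 64 bits (4 bits per digit). A quick sketch with the documented pattern follows; the helper `is_small_hex_number` is a hypothetical name, and since the pattern is shown without a case-insensitive flag the sketch lower-cases the value first, which is an assumption.

```python
import re

# Pattern copied from the class attribute documented above.
HEX_08_64_VALUE_REGEX = re.compile('^0x[0-9a-f]{1,16}$')


def is_small_hex_number(value: str) -> bool:
    """True for 0x-prefixed hex literals of at most 16 digits (64 bits)."""
    return HEX_08_64_VALUE_REGEX.match(value.lower()) is not None
```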

credsweeper.filters.value_jfrog_token_check module

class credsweeper.filters.value_jfrog_token_check.ValueJfrogTokenCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate has the known structure of a JFrog token

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_json_web_key_check module

class credsweeper.filters.value_json_web_key_check.ValueJsonWebKeyCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate is a JWK, which usually starts with ‘e’, and has private parts of the key. See https://datatracker.ietf.org/doc/html/rfc7517 and https://datatracker.ietf.org/doc/html/rfc7518

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received key which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_json_web_token_check module

class credsweeper.filters.value_json_web_token_check.ValueJsonWebTokenCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate is a JWT, which usually starts with ‘eyJ’. Registered keys are checked to be in the JWT parts. See https://www.iana.org/assignments/jose/jose.xhtml

header_keys = {'alg', 'apu', 'apv', 'aud', 'b64', 'crit', 'cty', 'enc', 'epk', 'iss', 'iv', 'jku', 'jwk', 'kid', 'nonce', 'p2c', 'p2s', 'ppt', 'sub', 'svt', 'tag', 'typ', 'url', 'x5c', 'x5t', 'x5t#S256', 'x5u', 'zip'}

payload_keys = {'alg', 'aud', 'crit', 'crv', 'd', 'dp', 'dq', 'e', 'enc', 'exp', 'ext', 'iat', 'id', 'iss', 'jku', 'jti', 'jwk', 'k', 'key_ops', 'keys', 'kid', 'kty', 'n', 'nbf', 'nonce', 'oth', 'p', 'password', 'q', 'qi', 'role', 'secret', 'sub', 'token', 'use', 'x', 'x5c', 'x5t', 'x5t#S256', 'x5u', 'y', 'zip'}

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received token which might be structured.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
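The `eyJ` prefix comes from base64url-encoding a JSON object that begins with `{"`. The sketch below decodes the first part of a token and checks it against a few of the documented `header_keys`. It is a simplified illustration, not the filter's logic: `HEADER_KEYS_SUBSET`, `b64url_encode`, `b64url_decode` and `header_has_registered_key` are hypothetical names, and only the header part is inspected.

```python
import base64
import json

# A small subset of the documented header_keys set.
HEADER_KEYS_SUBSET = {"alg", "typ", "kid"}


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used in JWT parts."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(part: str) -> bytes:
    """Restore the stripped padding and decode a base64url part."""
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def header_has_registered_key(token: str) -> bool:
    """True when the first part decodes to JSON with a known header key."""
    try:
        header = json.loads(b64url_decode(token.split(".")[0]))
    except ValueError:  # covers base64, UTF-8 and JSON decoding errors
        return False
    return isinstance(header, dict) and bool(
        HEADER_KEYS_SUBSET.intersection(header))


# Build an unsigned sample token locally instead of hard-coding one.
sample_header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
sample_token = sample_header + ".e30.signature"
```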

credsweeper.filters.value_last_word_check module

class credsweeper.filters.value_last_word_check.ValueLastWordCheck(config: Config | None = None)[source]

Bases: Filter

Check that secret is not a short value that ends with ‘:’.

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_length_check module

class credsweeper.filters.value_length_check.ValueLengthCheck(config: Config | None = None, min_len: int = 4, max_len: int = 8000)[source]

Bases: Filter

Check that candidate value length is between MIN and MAX.

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_method_check module

class credsweeper.filters.value_method_check.ValueMethodCheck(config: Config | None = None)[source]

Bases: Filter

Check if potential candidate value is a function.

Check if potential candidate value is a function by looking for ‘(’, ‘)’ or ‘function’ sub-strings in it

PATTERN = re.compile('^[~.\\->:0-9A-Za-z_]+\\(.*\\)')

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_morphemes_check module

class credsweeper.filters.value_morphemes_check.ValueMorphemesCheck(config: Config | None = None, threshold: int | None = None)[source]

Bases: Filter

Check value for a threshold of morphemes count

MAX_MORPHEMES_LIMIT = 9

THRESHOLDS_X3 = 13

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_not_allowed_pattern_check module

class credsweeper.filters.value_not_allowed_pattern_check.ValueNotAllowedPatternCheck(config: Config | None = None)[source]

Bases: Filter

Check that secret doesn’t open or close brackets or a new line.

NOT_ALLOWED = ['[<>\\[\\]{}]\\s+', '\\\\u00(26|3c)gt;?(\\s|\\\\+[nrt])?', '^\\s*\\\\', '^\\s*\\\\n\\s*']

NOT_ALLOWED_PATTERN = re.compile('(?:[<>\\[\\]{}]\\s+|\\\\u00(26|3c)gt;?(\\s|\\\\+[nrt])?|^\\s*\\\\|^\\s*\\\\n\\s*)$', re.IGNORECASE)

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_not_part_encoded_check module

class credsweeper.filters.value_not_part_encoded_check.ValueNotPartEncodedCheck(config: Config | None = None)[source]

Bases: Filter

Check that token is not a part of encoded data.

BASE64_ENCODED_DATA_PATTERN_AFTER = re.compile('(^|[^A-Za-z0-9]+)(?P<val>(([A-Za-z0-9=_-]{4}){4,64})|(([A-Za-z0-9=+/]{4}){4,64}))([^=A-Za-z0-9+/|_-]+|$)')

BASE64_ENCODED_DATA_PATTERN_BEFORE = re.compile('(^|[^A-Za-z0-9]+)(?P<val>(([A-Za-z0-9_-]{4}){16,64})|(([A-Za-z0-9+/]{4}){16,64}))([^=A-Za-z0-9+/|_-]+|$)')

static check_line_target_fit(line_data: LineData, target: AnalysisTarget) → bool[source]

Verifies whether line data is fit to be a part of many lines

static check_val(line: str, pattern: Pattern) → bool | None[source]

Verifies whether the line looks like a base64 pattern

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_number_check module

class credsweeper.filters.value_number_check.ValueNumberCheck(config: Config | None = None)[source]

Bases: Filter

Check whether the value is a number in hex or decimal representation

DEC_VALUE_REGEX = re.compile('^-?[0-9]{1,20}[ul]{0,3}$')

HEX_VALUE_REGEX = re.compile('^(0x)?[0-9a-f]{1,128}[ul]{0,3}$')

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
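The two documented patterns can be combined as in the sketch below. Both allow up to three trailing `u`/`l` suffix characters, as found on C-style integer literals. The helper `is_number` is a hypothetical name, and lower-casing the value before matching is an assumption, since the patterns are shown without a case-insensitive flag.

```python
import re

# Patterns copied from the class attributes documented above.
DEC_VALUE_REGEX = re.compile('^-?[0-9]{1,20}[ul]{0,3}$')
HEX_VALUE_REGEX = re.compile('^(0x)?[0-9a-f]{1,128}[ul]{0,3}$')


def is_number(value: str) -> bool:
    """True when the value is a decimal or hexadecimal number literal."""
    lowered = value.lower()
    return bool(DEC_VALUE_REGEX.match(lowered)
                or HEX_VALUE_REGEX.match(lowered))
```

Note that the hex pattern makes the `0x` prefix optional, so any string consisting only of hex digits counts as a number here.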

credsweeper.filters.value_pattern_check module

class credsweeper.filters.value_pattern_check.ValuePatternCheck(config: Config | None = None, pattern_len: int | None = None)[source]

Bases: Filter

Check if candidate value contains a specific pattern.

Similar to linguistic sequences of characters, random strings shouldn’t contain math sequences of characters. Based on “How Bad Can It Git? Characterizing Secret Leakage in Public GitHub Repositories”, details: https://www.ndss-symposium.org/ndss-paper/how-bad-can-it-git-characterizing-secret-leakage-in-public-github-repositories/

PatternCheck checks the occurrence in “line_data.value” of three types of sequence:

N or more identical characters in sequence, example: “AAAA”, “1111” …
N or more increasing characters sequentially, example: “abcd”, “1234” …
N or more decreasing characters sequentially, example: “dcba”, “4321” …

Default pattern LEN is 4

MAX_PATTERN_LENGTH = 13

ascending_pattern_check(value: str, bit_length: int) → bool[source]

Check if candidate value contains a sequence of 4 or more ascending chars or numbers.

Parameters:

value – credential candidate value
bit_length – speedup for len(value).bit_length()

Returns:

True if it does and False if not

check_val(value: str, bit_length: int) → bool[source]

Cumulative value check.

Parameters:

value – string variable, credential candidate value
bit_length – speedup for len(value).bit_length()

Returns:

boolean variable. True if it does and False if not

descending_pattern_check(value: str, bit_length: int) → bool[source]

Check if candidate value contains a sequence of 4 or more descending chars or numbers.

Parameters:

value – string variable, credential candidate value
bit_length – speedup for len(value).bit_length()

Returns:

boolean variable. True if it does and False if not

duple_pattern_check(value: str, bit_length: int) → bool[source]

Check if candidate value is a duplet value with possible patterns.

Parameters:

value – string variable, credential candidate value
bit_length – speedup for len(value).bit_length()

Returns:

boolean variable. True if it does and False if not

equal_pattern_check(value: str, bit_length: int) → bool[source]

Check if candidate value contains a sequence of 4 or more identical chars or numbers.

Parameters:

value – string variable, credential candidate value
bit_length – speedup for len(value).bit_length()

Returns:

True if it does and False if not

static get_pattern(pattern_len: int) → Pattern[source]

Creates regex pattern to find N or more identical characters in sequence

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – LineData object, credential candidate data
target – multiline target from which line data was obtained

Returns:

boolean variable. True if the candidate should be filtered out, False if it should be kept
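The three sequence types listed in the class description (identical, increasing, decreasing) can be sketched as follows. This is a simplified stand-alone illustration, not the library's code: it ignores the `bit_length` speedup and the duple check, and the names `get_equal_pattern`, `has_run` and `has_pattern` are hypothetical.

```python
import re


def get_equal_pattern(n: int) -> "re.Pattern[str]":
    """Regex for one character followed by at least n-1 copies of itself."""
    return re.compile(r"(.)\1{%d,}" % (n - 1))


def has_run(value: str, n: int, step: int) -> bool:
    """True if n consecutive characters each differ by `step` code points."""
    run = 1
    for prev, cur in zip(value, value[1:]):
        run = run + 1 if ord(cur) - ord(prev) == step else 1
        if run >= n:
            return True
    return False


def has_pattern(value: str, n: int = 4) -> bool:
    """True if the value has an identical, ascending or descending run."""
    return (get_equal_pattern(n).search(value) is not None
            or has_run(value, n, 1)
            or has_run(value, n, -1))
```

With the default length of 4, `AAAA`, `1234` and `dcba` are all detected, while a run of only three characters such as `abc` is not.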

credsweeper.filters.value_sealed_secret_check module

class credsweeper.filters.value_sealed_secret_check.ValueSealedSecretCheck(config: Config | None = None)[source]

Bases: Filter

Check that candidate may be a sealed secret. See https://github.com/bitnami-labs/sealed-secrets/blob/main/docs/developer/crypto.md

MAX_SEARCH_MARGIN = 100

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received value and check context for sealed secret markers. Can be applied effectively to a plain scan when the value is full and the target has lines around.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_search_check module

class credsweeper.filters.value_search_check.ValueSearchCheck(config: Config | None = None, pattern: str | None = None)[source]

Bases: Filter

Check whether a candidate value contains a pattern - useful for multi rules

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_similarity_check module

class credsweeper.filters.value_similarity_check.ValueSimilarityCheck(config: Config | None = None)[source]

Bases: Filter

Check whether the candidate value is more than 75% similar to the candidate variable, for example: secret = “mysecret” (0.8571).

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
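The figure 0.8571 in the class description is what `difflib.SequenceMatcher.ratio()` yields for `secret` and `mysecret`: the ratio is 2·M/T with M = 6 matching characters and T = 6 + 8 total characters. The sketch below reproduces it; that the filter itself relies on `difflib` is an assumption, and `similarity` and `is_too_similar` are hypothetical helper names.

```python
from difflib import SequenceMatcher


def similarity(variable: str, value: str) -> float:
    """Similarity ratio in [0, 1] between variable name and value."""
    return SequenceMatcher(None, variable, value).ratio()


def is_too_similar(variable: str, value: str, threshold: float = 0.75) -> bool:
    """True when the value looks like a variation of the variable name."""
    return similarity(variable, value) > threshold
```

A value that mostly repeats its own variable name is more likely a placeholder than a real secret.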

credsweeper.filters.value_split_keyword_check module

class credsweeper.filters.value_split_keyword_check.ValueSplitKeywordCheck(config: Config | None = None)[source]

Bases: Filter

Check the value by splitting it on standard whitespace separators and verifying that no word matches the checklist.

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_string_type_check module

class credsweeper.filters.value_string_type_check.ValueStringTypeCheck(config: Config | None = None, check_for_literals=True)[source]

Bases: Filter

Check if line_data is in a source code file that requires quotes for string declaration.

If it is, then check whether line_data really has a string literal declaration. Comment rows in source files (starting with //, /*, etc.) are ignored. The multiple-bytes scenario is allowed: [123,23,54,67,78,89] or {0xae, 0x54, 0x55, 0xff}

True if:

line_data has no value
line_data has no path
line_data is in a source code file (.cpp, .py, etc.), is not a comment and contains no quotes (so no string literal is declared)

False otherwise

MULTIBYTE_PATTERN = re.compile('((0x)?[0-9a-f]{1,16}[UL]*)(\\s*,\\s*((0x)?[0-9a-f]{1,16}[UL]*)){3}', re.IGNORECASE)

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
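The "multiple bytes scenario" mentioned in the class description corresponds to `MULTIBYTE_PATTERN`, which requires at least four comma-separated decimal or hex numbers. A short sketch with the documented pattern follows; the helper `has_multibyte_literal` is a hypothetical name and the use of `search` is an assumption.

```python
import re

# Pattern copied from the class attribute documented above.
MULTIBYTE_PATTERN = re.compile(
    r'((0x)?[0-9a-f]{1,16}[UL]*)(\s*,\s*((0x)?[0-9a-f]{1,16}[UL]*)){3}',
    re.IGNORECASE)


def has_multibyte_literal(text: str) -> bool:
    """True when the text contains four or more comma-separated numbers."""
    return MULTIBYTE_PATTERN.search(text) is not None
```

Both examples from the description are recognised, so a key declared as a byte array is not rejected just because it lacks quotes.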

credsweeper.filters.value_token_base32_check module

class credsweeper.filters.value_token_base32_check.ValueTokenBase32Check(config: Config | None = None)[source]

Bases: ValueTokenBaseCheck

Check that candidate has good randomization

RANGE_DICT = {8: ((3.480934, 0.8482364556537906), (1.9280820731422028, 0.5833143826506801)), 10: ((3.4801753333333334, 0.7508676237320747), (1.9558544090983234, 0.5119385414964345)), 15: ((3.4803549285714284, 0.603220270918794), (1.9896690734372564, 0.40640877687972476)), 16: ((3.4798649333333334, 0.5837818960141307), (1.9938368543943692, 0.392547066949958)), 20: ((3.4809878947368422, 0.518785674729997), (2.0058661928593517, 0.34692788889724946)), 24: ((3.480511086956522, 0.4726670109337228), (2.0131379532992537, 0.31476354168931936)), 25: ((3.480877375, 0.4626150412368404), (2.0147828593929953, 0.3075894753390553)), 32: ((3.4809023548387095, 0.4072672632996217), (2.0231609118646867, 0.2700344059876962)), 40: ((3.4801929743589746, 0.36361457820793436), (2.027858606807074, 0.2401498396303172)), 50: ((3.4798551224489795, 0.323708167297437), (2.0318808048208794, 0.2138098551294688)), 64: ((3.4805990476190476, 0.28572156450556774), (2.035756800745673, 0.18815721535870078))}

static get_stat_range(size: int) → Tuple[Tuple[float, float], Tuple[float, float]][source]: Returns the minimal and maximal values for hop and deviation. Precalculated data is used for speed.
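The shape of the precalculated table matches the return type of get_stat_range: each supported size maps to a pair of float pairs. The sketch below copies two entries from the base32 RANGE_DICT and looks one up. The function name lookup_stat_range, the error raised for an unlisted size, and the reading of the first pair as the hop values and the second as the deviation values are assumptions made for illustration, not the library's documented behavior.

```python
from typing import Dict, Tuple

StatRange = Tuple[Tuple[float, float], Tuple[float, float]]

# Excerpt of the base32 RANGE_DICT above (sizes 8 and 16 only).
RANGE_DICT: Dict[int, StatRange] = {
    8: ((3.480934, 0.8482364556537906), (1.9280820731422028, 0.5833143826506801)),
    16: ((3.4798649333333334, 0.5837818960141307), (1.9938368543943692, 0.392547066949958)),
}


def lookup_stat_range(size: int) -> StatRange:
    """Hypothetical lookup; raising on an unlisted size is an assumption."""
    if size not in RANGE_DICT:
        raise ValueError(f"no precalculated statistics for size {size}")
    return RANGE_DICT[size]


# Assumed order: first pair for hop, second pair for deviation.
hop_pair, deviation_pair = lookup_stat_range(16)
print(hop_pair, deviation_pair)
```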

credsweeper.filters.value_token_base36_check module

class credsweeper.filters.value_token_base36_check.ValueTokenBase36Check(config: Config | None = None)[source]

Bases: ValueTokenBaseCheck

Check that the candidate has good randomization

RANGE_DICT = {8: ((3.7190542428571427, 0.8995506118495411), (2.066095086865182, 0.609210293352161)), 10: ((3.719109611111111, 0.7956463384852813), (2.0946299036665494, 0.5322004874842623)), 15: ((3.719274257142857, 0.6401989313894239), (2.129437216268589, 0.42108786288993155)), 16: ((3.7192072666666665, 0.6188627491757901), (2.1336109506109366, 0.4064699817331141)), 20: ((3.719249815789474, 0.5506473627709657), (2.145293932511567, 0.3591543917048417)), 24: ((3.7191934304347827, 0.50051922802262), (2.152858549996053, 0.3252064160191062)), 25: ((3.7192351583333334, 0.4904181410613897), (2.1543202565038735, 0.31823801389315026)), 32: ((3.7190408419354837, 0.4315967526660196), (2.1620321219700767, 0.2788634701820312)), 40: ((3.7191682666666668, 0.3852248727988986), (2.16746680811131, 0.24802261318501675)), 50: ((3.718913744897959, 0.3436564880405547), (2.1715676118603806, 0.22070510537297627)), 64: ((3.7190009761904763, 0.30325954360127116), (2.1751172797904093, 0.1942582237461476))}

static get_stat_range(size: int) → Tuple[Tuple[float, float], Tuple[float, float]][source]: Returns the minimal and maximal values for hop and deviation. Precalculated data is used for speed.

credsweeper.filters.value_token_base64_check module

class credsweeper.filters.value_token_base64_check.ValueTokenBase64Check(config: Config | None = None)[source]

Bases: ValueTokenBaseCheck

Check that the candidate has good randomization

RANGE_DICT = {8: ((3.7627115714285715, 0.9413431166706269), (2.1378378843992736, 0.6394596814295781)), 10: ((3.7617393333333333, 0.8327986018456262), (2.168873183866972, 0.5605393324056347)), 15: ((3.7619624285714286, 0.6698092646328063), (2.2080058406286702, 0.4447698491992352)), 16: ((3.7618573333333334, 0.6471500119793832), (2.2116826642934453, 0.4288377928263507)), 20: ((3.7618887368421055, 0.575813792926031), (2.224384985667721, 0.37985781543221253)), 24: ((3.7621449565217393, 0.5243297908608613), (2.2326041329976607, 0.34397389723600613)), 25: ((3.762616791666667, 0.5137934920050976), (2.234571917211925, 0.3366547036535176)), 32: ((3.761885838709677, 0.4521158322065318), (2.2426375800006153, 0.29506039075960255)), 40: ((3.7622649487179487, 0.4031261511824518), (2.2485911621253574, 0.2622954601051068)), 50: ((3.762087693877551, 0.3597404118023357), (2.2533774423872956, 0.23384524947332655)), 64: ((3.7625271746031745, 0.31733579704946846), (2.257532519514275, 0.20571908142867643))}

static get_stat_range(size: int) → Tuple[Tuple[float, float], Tuple[float, float]][source]: Returns the minimal and maximal values for hop and deviation. Precalculated data is used for speed.

credsweeper.filters.value_token_base_check module

class credsweeper.filters.value_token_base_check.ValueTokenBaseCheck(config: Config | None = None)[source]

Bases: Filter

Check that the candidate has good randomization

MUL_DICT = {8: 2.61619746, 10: 2.48685659, 15: 2.34025271, 16: 2.3237029, 20: 2.27614996, 24: 2.24609586, 25: 2.24023515, 32: 2.21025277, 40: 2.18961571, 50: 2.17355282, 64: 2.15981241}

static get_ppf(n: int) → float[source]: Code used to produce the values

abstract static get_stat_range(size: int) → Tuple[Tuple[float, float], Tuple[float, float]][source]: Returns minimal strength. Precalculated data is applied for speedup

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept

credsweeper.filters.value_token_check module

class credsweeper.filters.value_token_check.ValueTokenCheck(config: Config | None = None)[source]

Bases: Filter

Check whether the first substring of the token is shorter than 5 characters.

Split the candidate value into substrings using the separators space, ;, `, {, }, (, ), <, >, [ and ]. Check whether the first substring is shorter than 5 characters.

Examples

“my password”
“12);password”

SPLIT_PATTERN = re.compile('(?<!\\W) (?!\\W)|[;(){}<>[\\]`]')
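To illustrate how SPLIT_PATTERN divides the two example values, the sketch below recompiles the same expression and takes the part before the first separator. The helper first_substring is hypothetical, and the snippet only shows the split and the length comparison; how the filter treats a value that contains no separator is not stated here.

```python
import re

# Same expression as SPLIT_PATTERN above: a space between two word characters,
# or any one of ; ( ) { } < > [ ] `
SPLIT_PATTERN = re.compile(r"(?<!\W) (?!\W)|[;(){}<>[\]`]")


def first_substring(value: str) -> str:
    """Hypothetical helper: the part of value before the first separator."""
    return SPLIT_PATTERN.split(value, maxsplit=1)[0]


for value in ("my password", "12);password"):
    head = first_substring(value)
    # Print the first substring and whether it is shorter than 5 characters.
    print(repr(head), len(head) < 5)
```

In both documented examples the first substring ("my" and "12") is shorter than 5 characters.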

run(line_data: LineData, target: AnalysisTarget) → bool[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data – credential candidate data
target – multiline target from which line data was obtained

Returns:

True if the candidate should be filtered out, False if it should be kept
