filters package

Submodules

filters.filter module

class credsweeper.filters.filter.Filter[source]

Bases: object

Base class for all filters that operate on ‘line_data’ objects.

abstract run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
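
The contract above (a single `run` method that returns True to drop a candidate) can be sketched as follows. This is a minimal, self-contained illustration, not the library's code: `FakeLineData`, the local `Filter` stand-in, and `EmptyValueCheck` are hypothetical names introduced only for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class FakeLineData:
    """Hypothetical stand-in for LineData with only the fields used here."""
    line: str
    value: str


class Filter(ABC):
    """Local stand-in that mirrors the documented interface."""

    @abstractmethod
    def run(self, line_data: FakeLineData) -> bool:
        """Return True to filter the candidate out, False to keep it."""


class EmptyValueCheck(Filter):
    """Hypothetical filter: drop candidates whose value is empty."""

    def run(self, line_data: FakeLineData) -> bool:
        return not line_data.value.strip()


candidates = [
    FakeLineData('password = ""', ""),
    FakeLineData("password = 'Xy7#kQ9z'", "Xy7#kQ9z"),
]
# Keep only the candidates for which the filter returns False.
kept = [c for c in candidates if not EmptyValueCheck().run(c)]
```

Returning True for "drop" means several filters can be combined with `any(...)`: a candidate survives only if every filter returns False.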

filters.line_specific_key_check module

class credsweeper.filters.line_specific_key_check.LineSpecificKeyCheck[source]

Bases: Filter

Check that none of the values from the list below appears in the candidate line.

NOT_ALLOWED = ['example', 'enc\\(', 'enc\\[', 'true', 'false']
NOT_ALLOWED_PATTERN = regex.Regex('(?:example|enc\\(|enc\\[|true|false)', flags=regex.I | regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
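
A rough illustration of how the NOT_ALLOWED entries combine into one case-insensitive alternation. The sketch uses the standard-library `re` module instead of the third-party `regex` package, and it assumes the pattern is searched anywhere in the line; `line_is_filtered` is a hypothetical helper, not part of the documented API.

```python
import re

NOT_ALLOWED = ["example", r"enc\(", r"enc\[", "true", "false"]
# Join the entries into one non-capturing alternation, ignoring case.
NOT_ALLOWED_PATTERN = re.compile("(?:" + "|".join(NOT_ALLOWED) + ")", re.IGNORECASE)


def line_is_filtered(line: str) -> bool:
    """Hypothetical helper: True if any not-allowed entry occurs in the line."""
    return NOT_ALLOWED_PATTERN.search(line) is not None
```

Under these assumptions, a line such as `password = ENC(abcdef)` is filtered because `enc\(` matches regardless of case.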

filters.separator_unusual_check module

class credsweeper.filters.separator_unusual_check.SeparatorUnusualCheck[source]

Bases: Filter

Check that the candidate has no double-symbol operators (like ++, --, <<) or comparison operators (like != or ==) as a separator.

Example

pwd == ‘value’

pwd != ‘value’

pwd << value

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_allowlist_check module

class credsweeper.filters.value_allowlist_check.ValueAllowlistCheck[source]

Bases: Filter

Check that no pattern from the list is present in the candidate value.

ALLOWED = ['ENC\\(.*\\)', 'ENC\\[.*\\]', '\\$\\{.*\\}', '#\\{.*\\}', '\\{\\{.+\\}\\}', '(\\w|\\d|\\.|->)+\\(.*\\)', '\\*\\*\\*\\*\\*']
ALLOWED_PATTERN = regex.Regex('(?:ENC\\(.*\\)|ENC\\[.*\\]|\\$\\{.*\\}|#\\{.*\\}|\\{\\{.+\\}\\}|(\\w|\\d|\\.|->)+\\(.*\\)|\\*\\*\\*\\*\\*)', flags=regex.I | regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_array_dictionary_check module

class credsweeper.filters.value_array_dictionary_check.ValueArrayDictionaryCheck[source]

Bases: Filter

Match call to dictionary or array element.

This filter checks only calls, not declarations:

token = values[i] would be filtered

token = {‘root’} would be kept

PATTERN = regex.Regex('\\[(\'|")?.+(\'|")?\\]', flags=regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
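
The two cases described for this class can be reproduced with the documented PATTERN. This sketch compiles it with the standard-library `re` module and assumes the pattern is searched within the candidate value; `is_element_access` is a hypothetical helper name.

```python
import re

# Same expression as PATTERN above, written as a raw string.
PATTERN = re.compile(r"""\[('|")?.+('|")?\]""")


def is_element_access(value: str) -> bool:
    """Hypothetical helper: True if the value contains a bracketed element access."""
    return PATTERN.search(value) is not None
```

Here `values[i]` and `config['token']` match, while `{'root'}` does not because it contains no square brackets.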

filters.value_blocklist_check module

class credsweeper.filters.value_blocklist_check.ValueBlocklistCheck[source]

Bases: Filter

Check that words from the block list make up less than 70% of the candidate value length.

NOT_ALLOWED = ['true', 'false', 'null', 'bearer', 'string']
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
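
One possible reading of the 70% rule, as a sketch only: a candidate is dropped when a block-list word occurs in the value and accounts for at least 70% of the value's length. The helper name, the case-insensitive comparison, and the treatment of the exact 70% boundary are assumptions, not confirmed behaviour.

```python
NOT_ALLOWED = ["true", "false", "null", "bearer", "string"]


def is_mostly_blocklisted(value: str, threshold: float = 0.7) -> bool:
    """Hypothetical helper: True if a block-list word dominates the value."""
    lowered = value.lower()
    if not lowered:
        return False  # nothing to measure against
    return any(
        word in lowered and len(word) / len(lowered) >= threshold
        for word in NOT_ALLOWED
    )
```

With this reading, `NULL;` is dropped (4 of 5 characters), while a longer value that merely starts with `bearer` is kept.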

filters.value_camel_case_check module

class credsweeper.filters.value_camel_case_check.ValueCamelCaseCheck[source]

Bases: Filter

Check that candidate is not written in camel case.

CAMEL_CASE = ['^([a-z]+([A-Z][a-z]+)+)$', '^([A-Z][a-z]+([A-Z][a-z]+)+)$']
CAMEL_CASE_PATTERN = regex.Regex('(?:^([a-z]+([A-Z][a-z]+)+)$|^([A-Z][a-z]+([A-Z][a-z]+)+)$)', flags=regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
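
The two CAMEL_CASE expressions cover lower camel case and upper camel case respectively. A small sketch with the standard-library `re` module (the documented pattern is built with `regex`); `is_camel_case` is a hypothetical helper, and applying the pattern to the whole value is an assumption.

```python
import re

CAMEL_CASE = [r"^([a-z]+([A-Z][a-z]+)+)$", r"^([A-Z][a-z]+([A-Z][a-z]+)+)$"]
# No IGNORECASE flag here: letter case is exactly what the pattern tests.
CAMEL_CASE_PATTERN = re.compile("(?:" + "|".join(CAMEL_CASE) + ")")


def is_camel_case(value: str) -> bool:
    """Hypothetical helper: True if the value is purely camel-cased letters."""
    return CAMEL_CASE_PATTERN.match(value) is not None
```

Values containing digits or symbols, such as `Xy7kQ9z`, do not match because both expressions allow letters only.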

filters.value_dictionary_keyword_check module

class credsweeper.filters.value_dictionary_keyword_check.ValueDictionaryKeywordCheck[source]

Bases: Filter

Check that no word from the dictionary is present in the candidate value.

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_dictionary_value_length_check module

class credsweeper.filters.value_dictionary_value_length_check.ValueDictionaryValueLengthCheck[source]

Bases: Filter

Check that candidate length is between 5 and 30.

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_entropy_check module

class credsweeper.filters.value_entropy_check.ValueEntropyCheck[source]

Bases: Filter

Check that the candidate has Shannon entropy > 3 (for HEX_CHARS or BASE36_CHARS) or > 4.5 (for BASE64_CHARS).

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
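
For reference, Shannon entropy in bits per character can be computed as below. This is the textbook formula over the characters that actually occur in the string; the library may compute it relative to a fixed character set, so treat `shannon_entropy` as a hypothetical helper and not as the filter's implementation.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Hypothetical helper: Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    # H = -sum(p * log2(p)) over the relative frequency p of each character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A string of one repeated character has entropy 0, and a string of 16 distinct characters has entropy 4, which would clear the hexadecimal threshold of 3 quoted above.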

filters.value_file_path_check module

class credsweeper.filters.value_file_path_check.ValueFilePathCheck[source]

Bases: Filter

Check whether the candidate value is a path.

Check if a value contains either ‘/’ or ‘:' separators (but not both) and does not have any special characters ( !$`&*()+)

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_first_word_check module

class credsweeper.filters.value_first_word_check.ValueFirstWordCheck[source]

Bases: Filter

Check that the secret doesn’t start with a special character.

NOT_ALLOWED = ['\\=', '\\{', '\\)', '\\<', '\\>', '\\#', '\\:', '\\\\', '\\/\\/', '\\_', '\\\\[u]', '\\/\\*', '\\%[deflspuvxz]']
NOT_ALLOWED_PATTERN = regex.Regex('^(?:\\=|\\{|\\)|\\<|\\>|\\#|\\:|\\\\|\\/\\/|\\_|\\\\[u]|\\/\\*|\\%[deflspuvxz])', flags=regex.I | regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_last_word_check module

class credsweeper.filters.value_last_word_check.ValueLastWordCheck[source]

Bases: Filter

Check that the secret is not a short value that ends with ‘:’.

NOT_ALLOWED_COLON_PATTERN = regex.Regex('.*:$', flags=regex.I | regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_length_check module

class credsweeper.filters.value_length_check.ValueLengthCheck(min_len)[source]

Bases: Filter

Check that the candidate value is not too short (its length is greater than or equal to min_len).

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_method_check module

class credsweeper.filters.value_method_check.ValueMethodCheck[source]

Bases: Filter

Check if the candidate value is a function.

Check if the candidate value is a function by looking for ‘(’, ‘)’ or ‘function’ sub-strings in it.

PATTERN = regex.Regex('.*\\(.*\\).*', flags=regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_not_allowed_pattern module

class credsweeper.filters.value_not_allowed_pattern.ValueNotAllowedPatternCheck[source]

Bases: Filter

Check that the secret doesn’t open or close brackets or a new line.

NOT_ALLOWED = ['[,<>{};\\]\\[](\\s)*', '(\\s)+[\\\\]', '(\\\\n)(\\s)*']
NOT_ALLOWED_PATTERN = regex.Regex('(?:[,<>{};\\]\\[](\\s)*|(\\s)+[\\\\]|(\\\\n)(\\s)*)$', flags=regex.I | regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
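
Note that NOT_ALLOWED_PATTERN ends with `$`, so the listed fragments only count when they sit at the end of the text. The sketch below rebuilds the pattern with the standard-library `re` module and assumes it is searched against the candidate value; `ends_with_not_allowed` is a hypothetical helper.

```python
import re

NOT_ALLOWED = [r"[,<>{};\]\[](\s)*", r"(\s)+[\\]", r"(\\n)(\s)*"]
# The trailing "$" anchors every alternative to the end of the text.
NOT_ALLOWED_PATTERN = re.compile("(?:" + "|".join(NOT_ALLOWED) + ")$", re.IGNORECASE)


def ends_with_not_allowed(value: str) -> bool:
    """Hypothetical helper: True if the value ends with a not-allowed fragment."""
    return NOT_ALLOWED_PATTERN.search(value) is not None
```

Under these assumptions a brace in the middle of a value (`a{b`) is accepted, while a trailing brace, semicolon, line-continuation backslash, or literal `\n` escape is not.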

filters.value_pattern_filter module

class credsweeper.filters.value_pattern_filter.ValuePatternCheck(pattern_len=4)[source]

Bases: Filter

Check if the candidate value contains a specific pattern.

Similar to linguistic sequences of characters, random strings shouldn’t contain math sequences of characters. Based on “How Bad Can It Git? Characterizing Secret Leakage in Public GitHub Repositories”, details: https://www.ndss-symposium.org/ndss-paper/how-bad-can-it-git-characterizing-secret-leakage-in-public-github-repositories/

PatternCheck checks for the occurrence in “line_data.value” of three types of sequence:

  • N or more identical characters in sequence, example: “AAAA”, “1111” …

  • N or more increasing characters sequentially, example: “abcd”, “1234” …

  • N or more decreasing characters sequentially, example: “dcba”, “4321” …

Default pattern LEN is 4

ascending_pattern_check(line_data_value)[source]

Check if the candidate value contains a sequence of 4 or more ascending characters or digits.

Arg:

line_data_value: credential candidate value

Return type:

bool

Returns:

True if the value contains such a sequence, False otherwise

descending_pattern_check(line_data_value)[source]

Check if the candidate value contains a sequence of 4 or more descending characters or digits.

Arg:

line_data_value: string variable, credential candidate value

Return type:

bool

Returns:

True if the value contains such a sequence, False otherwise

equal_pattern_check(line_data_value)[source]

Check if the candidate value contains a sequence of 4 or more identical characters or digits.

Parameters:

line_data_value (str) – string variable, credential candidate value

Return type:

bool

Returns:

True if the value contains such a sequence, False otherwise

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Arg:

line_data: LineData object, credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
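
The three sequence types described for this class can be sketched with one helper that looks for runs of characters whose code points differ by a fixed step (0 for identical, +1 for ascending, -1 for descending). The function names and the code-point comparison are assumptions made for this illustration; the library's methods may handle details differently.

```python
def has_run(value: str, step: int, length: int = 4) -> bool:
    """Hypothetical helper: True if `length` consecutive characters differ by `step`."""
    run = 1
    for prev, cur in zip(value, value[1:]):
        # Extend the current run if the step matches, otherwise start over.
        run = run + 1 if ord(cur) - ord(prev) == step else 1
        if run >= length:
            return True
    return False


def equal_pattern(value: str, length: int = 4) -> bool:
    return has_run(value, 0, length)


def ascending_pattern(value: str, length: int = 4) -> bool:
    return has_run(value, 1, length)


def descending_pattern(value: str, length: int = 4) -> bool:
    return has_run(value, -1, length)


def has_any_pattern(value: str, length: int = 4) -> bool:
    """True if the value contains any of the three sequence types."""
    return any(has_run(value, step, length) for step in (0, 1, -1))
```

The `length` argument plays the role of `pattern_len`, so `has_any_pattern("pw1234")` is true with the default of 4 but false with `length=5`.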

filters.value_similarity_check module

class credsweeper.filters.value_similarity_check.ValueSimilarityCheck[source]

Bases: Filter

Check if the candidate value is at least 70% similar to the candidate keyword, for example: secret = “mysecret”.

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
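
The documentation does not say which similarity measure is used. As an illustration only, the sketch below uses `difflib.SequenceMatcher` from the standard library with a 0.7 threshold; the choice of measure, the case-insensitive comparison, and the helper name `is_similar` are all assumptions.

```python
from difflib import SequenceMatcher


def is_similar(key: str, value: str, threshold: float = 0.7) -> bool:
    """Hypothetical helper: True if key and value are at least `threshold` similar."""
    ratio = SequenceMatcher(None, key.lower(), value.lower()).ratio()
    return ratio >= threshold
```

For the example from the class description, `secret` and `mysecret` share a six-character block, giving a ratio of 12/14, which is above the threshold.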

filters.value_string_type_check module

class credsweeper.filters.value_string_type_check.ValueStringTypeCheck(config)[source]

Bases: Filter

Check if line_data is in a source code file that requires quotes for string declarations.

If it is, check whether line_data really has a string literal declaration. Comment rows in source files (starting with //, /*, etc.) are ignored.

True if:

  • line_data has no value

  • line_data has no path

  • line_data is in a source code file (.cpp, .py, etc.), is not a comment, and contains no quotes (so no string literal is declared)

False otherwise

run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.value_token_check module

class credsweeper.filters.value_token_check.ValueTokenCheck[source]

Bases: Filter

Check if the first substring of the token is shorter than 5.

Split the candidate value into substrings using the separators listed in SPLIT_PATTERN (space, ;, ), (, {, }, <, >, [, ] and `). Check if the first substring is shorter than 5 characters.

Examples

“my password”

“12);password”

SPLIT_PATTERN = ' |;|\\)|\\(|{|}|<|>|\\[|\\]|`'
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept
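
The split-and-measure step can be sketched with `re.split` from the standard library and the documented SPLIT_PATTERN. The helper name `first_token_is_short` and the fixed limit of 5 taken from the description are assumptions for this illustration.

```python
import re

SPLIT_PATTERN = r" |;|\)|\(|{|}|<|>|\[|\]|`"


def first_token_is_short(value: str, min_len: int = 5) -> bool:
    """Hypothetical helper: True if the text before the first separator is shorter than min_len."""
    first = re.split(SPLIT_PATTERN, value, maxsplit=1)[0]
    return len(first) < min_len
```

Both documented examples are caught: “my password” splits at the space (first token `my`), and “12);password” splits at `)` (first token `12`).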

filters.value_useless_word_check module

class credsweeper.filters.value_useless_word_check.ValueUselessWordCheck[source]

Bases: Filter

Check if the candidate value contains substrings with operators (like ->).

NOT_ALLOWED = ['((\\{)?(0x)+([0-9a-f]|\\%){1}.*)', '(\\-\\>.*)', '(xxxx.*)', '(\\s).*']
NOT_ALLOWED_PATTERN = regex.Regex('(?:((\\{)?(0x)+([0-9a-f]|\\%){1}.*)|(\\-\\>.*)|(xxxx.*)|(\\s).*)', flags=regex.I | regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

filters.variable_check module

class credsweeper.filters.variable_check.VariableCheck[source]

Bases: Filter

Check if candidate variable is a regex placeholder or ends with match character (like + or >).

NOT_ALLOWED = ['^([<]|\\{\\{).*', '(\\@.*)', '[!><+*/^|)](\\s)?$']
NOT_ALLOWED_PATTERN = regex.Regex('(?:^([<]|\\{\\{).*|(\\@.*)|[!><+*/^|)](\\s)?$)', flags=regex.I | regex.V0)
run(line_data)[source]

Run filter checks on received credential candidate data ‘line_data’.

Parameters:

line_data (LineData) – credential candidate data

Return type:

bool

Returns:

True if the candidate should be filtered out, False if it should be kept

Module contents