credsweeper.credentials package¶

Submodules¶

credsweeper.credentials.augment_candidates module¶

credsweeper.credentials.augment_candidates.augment_candidates(candidates, new_candidates)[source]¶

Augment candidates with the entries of new_candidates whose line data value is not already present in candidates

Parameters:

candidates (List[Candidate]) – [IN/OUT] list of candidates to be augmented
new_candidates (List[Candidate]) – [IN] list with new candidates
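
The in-place augmentation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `SketchCandidate` and `augment_sketch` are hypothetical names, only the detected value is modeled, and skipping duplicates inside new_candidates is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchCandidate:
    """Hypothetical stand-in for Candidate; only the detected value is modeled."""
    value: str


def augment_sketch(candidates: List[SketchCandidate],
                   new_candidates: List[SketchCandidate]) -> None:
    """Append new candidates whose value is not yet present (updates the list in place)."""
    known_values = {c.value for c in candidates}
    for candidate in new_candidates:
        if candidate.value not in known_values:
            candidates.append(candidate)
            known_values.add(candidate.value)


found = [SketchCandidate("abc123")]
augment_sketch(found, [SketchCandidate("abc123"), SketchCandidate("xyz789")])
```

After the call, `found` holds the original candidate plus the one carrying the previously unseen value.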

credsweeper.credentials.candidate module¶

class credsweeper.credentials.candidate.Candidate(line_data_list, patterns, rule_name, severity, config=None, validations=None, use_ml=False, confidence=Confidence.MODERATE)[source]¶

Bases: object

Candidates that can be credentials.

The class contains a list of LineData, some attributes from the Rule object, and the config

Parameters:

line_data_list (List[LineData]) – List of LineData
patterns (List[Pattern]) – Regular expressions that can be used for detection
rule_name (str) – Name of Rule
severity (Severity) – critical/high/medium/low
confidence (Confidence) – strong/moderate/weak
config (Optional[Config]) – user configs
validations (Optional[List[Validation]]) – List of Validation objects that can check this credential using external API
use_ml (bool) – Whether ML should be applied to this credential. If not, the prediction is based on regular expressions and filters only

compare(other)[source]¶

Comparison method: checks only the final credential result

Return type:: bool

classmethod get_dummy_candidate(config, file_path, file_type, info)[source]¶

Create a dummy instance for use when searching files by extension

is_api_validation_available()[source]¶

Check if current credential candidate can be validated with external API.

Return type:: bool
Returns:: True if any validation available, False otherwise

to_dict_list(hashed, subtext)[source]¶

Convert credential candidate object to List[dict].

Return type:: List[dict]
Returns:: List[dict] object generated from current credential candidate

to_json(hashed, subtext)[source]¶

Convert credential candidate object to dictionary.

Return type:: Dict
Returns:: Dictionary object generated from current credential candidate

to_str(subtext=False, hashed=False)[source]¶

Represent the candidate with subtext and/or hashed values

Return type:: str

credsweeper.credentials.candidate_group_generator module¶

class credsweeper.credentials.candidate_group_generator.CandidateGroupGenerator[source]¶

Bases: object

property grouped_candidates: Dict[CandidateKey, List[Candidate]]¶

Property getter

items()[source]¶

Getter for the grouped candidates as (key, candidates) pairs

Return type:: List[Tuple[CandidateKey, List[Candidate]]]

credsweeper.credentials.candidate_key module¶

class credsweeper.credentials.candidate_key.CandidateKey(line_data)[source]¶

Bases: object

Class used to identify credential candidates.

Candidates that detect the same value on the same line in the same file have an identical CandidateKey
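
The identity rule can be illustrated with a small value object. This is a sketch under assumptions: `KeySketch` is a hypothetical class, and the choice of fields (path, line number, value) follows the description of group_credentials() on this page rather than the library source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeySketch:
    """Hypothetical key: equal when path, line number and value all match."""
    path: str
    line_num: int
    value: str


key_a = KeySketch("config.py", 10, "abc123")
key_b = KeySketch("config.py", 10, "abc123")  # same file, line and value
key_c = KeySketch("config.py", 11, "abc123")  # same value on another line

# A frozen dataclass is hashable, so equal keys collapse in a set or dict.
distinct_keys = {key_a, key_b, key_c}
```

Because equal keys hash identically, two rules that detect the same secret on the same line end up under one dictionary entry.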

credsweeper.credentials.credential_manager module¶

class credsweeper.credentials.credential_manager.CredentialManager[source]¶

Bases: object

The manager allows you to store, add and delete separate credential candidates.

Parameters:: candidates – list of credential candidates

add_credential(candidate)[source]¶

Add credential candidate to the manager.

Parameters:: candidate (Candidate) – credential candidate to be added
Return type:: None

get_credentials()[source]¶

Get all credential candidates stored in the manager.

Return type:: List[Candidate]
Returns:: List with all Candidate objects stored in manager

group_credentials()[source]¶

Join candidates that reference the same secret value in the same line.

A candidate can belong to two groups at the same time if it contains more than one LineData object

Return type:: CandidateGroupGenerator
Returns:: Generator object containing a dictionary that maps [path, line_num, value] to a list of credential candidates
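
The grouping rule, including the case where one candidate lands in two groups, can be sketched with plain dictionaries. `LineSketch`, `CandidateSketch` and `group_sketch` are hypothetical stand-ins and do not reproduce the CandidateGroupGenerator API.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class LineSketch(NamedTuple):
    """Hypothetical stand-in for LineData; the tuple itself serves as the group key."""
    path: str
    line_num: int
    value: str


class CandidateSketch(NamedTuple):
    """Hypothetical stand-in for Candidate."""
    rule_name: str
    line_data_list: Tuple[LineSketch, ...]


def group_sketch(candidates: List[CandidateSketch]) -> Dict[LineSketch, List[CandidateSketch]]:
    """Map each (path, line_num, value) to every candidate that references it."""
    groups = defaultdict(list)
    for candidate in candidates:
        for line_data in candidate.line_data_list:
            groups[line_data].append(candidate)
    return dict(groups)


first_line = LineSketch("a.py", 1, "tok")
second_line = LineSketch("a.py", 2, "more")
single = CandidateSketch("Token", (first_line,))
multi = CandidateSketch("Multiline", (first_line, second_line))
groups = group_sketch([single, multi])
```

Here `multi` appears in both groups because it carries two line data entries, while `single` appears only under the first key.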

purge_duplicates()[source]¶

Purge duplicate candidates, which may appear in overlaps when a long line is scanned.

Return type:: int
Returns:: number of removed duplicates
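
One way to purge duplicates while reporting how many were removed is shown below. It is a sketch only: the `Detection` tuple and `purge_duplicates_sketch` are hypothetical, and treating (path, line_num, value) as the duplicate criterion is an assumption.

```python
from typing import List, Tuple

# Hypothetical record: (path, line_num, value) identifies a detection.
Detection = Tuple[str, int, str]


def purge_duplicates_sketch(detections: List[Detection]) -> int:
    """Drop repeated detections in place and return how many were removed."""
    seen = set()
    unique = []
    for detection in detections:
        if detection not in seen:
            seen.add(detection)
            unique.append(detection)
    removed = len(detections) - len(unique)
    detections[:] = unique  # keep the same list object, preserve first-seen order
    return removed


items = [("a.py", 1, "tok"), ("a.py", 1, "tok"), ("a.py", 2, "tok")]
removed_count = purge_duplicates_sketch(items)
```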

remove_credential(candidate)[source]¶

Remove credential candidate from the manager.

Parameters:: candidate (Candidate) – credential candidate to be removed
Return type:: None

set_credentials(candidates)[source]¶

Remove all current credential candidates from the manager and add new credentials.

Parameters:: candidates (List[Candidate]) – List with candidates to replace current candidates in the manager
Return type:: None

credsweeper.credentials.line_data module¶

class credsweeper.credentials.line_data.LineData(config, line, line_pos, line_num, path, file_type, info, pattern, match_obj=None)[source]¶

Bases: object

Object to treat and store scanned line related data.

Parameters:

key – Optional[str] = None
line (str) – string variable, line
line_num (int) – int variable, number of line in file
path (str) – string variable, path to file
file_type (str) – string variable, extension of the file, e.g. ‘.txt’
info (str) – additional info about how the data was detected
pattern (Pattern) – regex pattern, detected pattern in line
separator – optional string variable, separators between variable and value
separator_start – optional variable, separator position start
value – optional string variable, detected value in line
variable – optional string variable, detected variable in line

EXCEPTION_POSITION = -2¶

INITIAL_WRONG_POSITION = -3¶

bash_param_split = re.compile('\\s+(\\-|\\||\\>|\\w+?\\>|\\&)')¶
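
The effect of this pattern can be tried in isolation. The pattern below is copied from the attribute above; the helper `cut_at_bash_special` is hypothetical, and keeping only the first piece of the split is an assumption about how such a pattern might be used.

```python
import re

# Pattern copied from the documented LineData.bash_param_split attribute.
bash_param_split = re.compile('\\s+(\\-|\\||\\>|\\w+?\\>|\\&)')


def cut_at_bash_special(text: str) -> str:
    """Hypothetical helper: keep only the part before the first bash separator."""
    return bash_param_split.split(text)[0]


examples = [
    "secret123 --verbose",    # option after the value
    "secret123 | grep x",     # pipe
    "secret123 2>/dev/null",  # redirect with a descriptor
    "secret123",              # nothing to cut
]
results = [cut_at_bash_special(example) for example in examples]
```

Every entry in `results` is the bare `secret123`, since the separator must be preceded by whitespace to match.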

check_url_part()[source]¶

Determine whether the value is part of a URL-like line

Return type:: bool

clean_bash_parameters()[source]¶

Split variable and value by bash special characters if the line is assumed to be a CLI command.

Return type:: None

clean_toml_parameters()[source]¶

Parentheses, curly braces and square brackets may be caught in TOML format and in bash. This method performs a simple cleanup.

Return type:: None

clean_url_parameters()[source]¶

Clean ‘query parameters’ from a URL address.

If the line seems to be a URL, split it by the & character. The variable should be the right-most value after & or ? ([-1]), and the value should be the left-most part before & ([0])

Return type:: None
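
The [-1] and [0] indexing described above can be sketched as a standalone function. `clean_url_parameters_sketch` is a hypothetical helper that returns the cleaned pair instead of updating a LineData object, and it omits the check that decides whether the line looks like a URL.

```python
from typing import Tuple


def clean_url_parameters_sketch(variable: str, value: str) -> Tuple[str, str]:
    """Hypothetical helper mirroring the rule described above."""
    # Variable: right-most piece after the last '&' or '?'.
    variable = variable.split('&')[-1].split('?')[-1]
    # Value: left-most piece before the first '&'.
    value = value.split('&')[0]
    return variable, value


cleaned = clean_url_parameters_sketch("example.com/login?user=bob&token", "abc123&lang=en")
only_query = clean_url_parameters_sketch("example.com/login?token", "abc123")
```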

comment_starts = ('//', '* ', '#', '/*', '<!––', '%{', '%', '...', '(*', '--', '--[[', '#=')¶

compare(other)[source]¶

Comparison method: skips the whole line and checks only whether variable and value are the same

Return type:: bool

static get_hash_or_subtext(text, hashed, cut_pos=None)[source]¶

Represent non-empty text as a hash or a “beauty” subtext, if required

Parameters:

text (Optional[str]) – str - input string
hashed (bool) – bool - whether the text will be hashed and returned
cut_pos (Optional[StartEnd]) – Optional[StartEnd] - start, end positions which text must be kept in output

Return type:

Optional[str]

Returns:

SHA-256 hash, in hex representation, of the UTF-8 encoded input text, or the subtext from start to end, or the original text as is
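
The documented contract can be approximated with the standard library. This is a sketch, not the library code: `hash_or_subtext_sketch` is hypothetical, `cut_pos` is modeled as a plain (start, end) tuple instead of StartEnd, hashing taking precedence over cutting is an assumption, and any extra formatting of the “beauty” subtext is not reproduced.

```python
import hashlib
from typing import Optional, Tuple


def hash_or_subtext_sketch(text: Optional[str],
                           hashed: bool,
                           cut_pos: Optional[Tuple[int, int]] = None) -> Optional[str]:
    """Hypothetical re-implementation of the documented contract."""
    if not text:
        # Empty or missing text is returned unchanged.
        return text
    if hashed:
        # SHA-256 of the UTF-8 bytes, as a 64-character hex string.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()
    if cut_pos is not None:
        start, end = cut_pos
        return text[start:end]
    return text


digest = hash_or_subtext_sketch("secret", hashed=True)
piece = hash_or_subtext_sketch("secret", hashed=False, cut_pos=(0, 3))
```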

initialize(match_obj=None)[source]¶

Apply regex to the candidate line and set internal fields based on match.

Return type:: None

is_comment()[source]¶

Check if line with credential is a comment.

Return type:: bool
Returns:: True if line is a comment, False otherwise
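
A prefix check of this kind can be sketched with `str.startswith`, which accepts a tuple of prefixes. The tuple below is a subset of the documented comment_starts attribute; `is_comment_sketch` is hypothetical, and ignoring leading whitespace is an assumption.

```python
# Subset of the documented LineData.comment_starts tuple.
comment_starts = ('//', '* ', '#', '/*', '--')


def is_comment_sketch(line: str) -> bool:
    """Hypothetical check: does the line, ignoring leading whitespace, start like a comment?"""
    return line.lstrip().startswith(comment_starts)


commented = is_comment_sketch("    # password = 'x'")
trailing_note = is_comment_sketch("password = 'x'  # note")
```

A trailing comment does not count here, because only the start of the line is inspected.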

property is_quoted: bool¶

Check if variable and value are in a quoted string.

Returns:: True if the candidate is in a quoted string, False otherwise

is_source_file()[source]¶

Check if file with credential is a source code file or not (data, log, plain text).

Return type:: bool
Returns:: True if file is source file, False otherwise

is_source_file_with_quotes()[source]¶

Check if the file with the credential requires quotation for string literals.

Return type:: bool
Returns:: True if the file requires quotation, False otherwise

property is_well_quoted_value: bool¶

Well-quoted value: the value has been quoted or has a line wrap

line_endings = re.compile('\\\\{1,8}[nr]')¶
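
This pattern matches one to eight literal backslashes followed by n or r, that is, escaped line endings embedded in text. The snippet below only demonstrates the match behavior; how the library applies the pattern is not shown on this page.

```python
import re

# Pattern copied from the documented LineData.line_endings attribute.
line_endings = re.compile('\\\\{1,8}[nr]')

# A raw string, so the backslashes are literal characters, not real newlines.
text = r"first\nsecond\\r\\nthird"

# Adjacent matches produce empty pieces, which are filtered out here.
parts = [part for part in line_endings.split(text) if part]
```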

quotation_marks = ('"', "'", '`')¶

sanitize_value()[source]¶

Clean the found value from extra artifacts. Correct positions if changed.

sanitize_variable()[source]¶

Remove surrounding spaces, dashes and quotation marks from the variable. Correct the position.

Return type:: None
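
Stripping and position correction can be sketched with `str.strip`, using the character set documented as variable_strip_pattern on this page. Whether sanitize_variable() uses that attribute in exactly this way is an assumption.

```python
# Character set copied from the documented LineData.variable_strip_pattern attribute.
variable_strip_pattern = ' \t\n\r\x0b\x0c,\'"-;'

raw_variable = ' "--api_key"; '
clean_variable = raw_variable.strip(variable_strip_pattern)

# Offset of the cleaned name inside the raw text, usable to correct a start position.
offset = raw_variable.find(clean_variable)
```

The underscore is not part of the character set, so `api_key` survives intact while the surrounding quotes, dashes, semicolon and spaces are removed.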

to_json(hashed, subtext)[source]¶

Convert line data object to dictionary.

Return type:: Dict
Returns:: Dictionary object generated from current line data

to_str(subtext=False, hashed=False)[source]¶

Represent line_data with subtext and/or hashed values

Return type:: str

url_chars_not_allowed_pattern = re.compile('[\\s"<>\\[\\]^~`{|}]')¶

url_param_split = re.compile('(%|\\\\u(00){0,2})(26|3f)', re.IGNORECASE)¶
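
This pattern recognizes encoded forms of & and ?, such as %26, %3F or \u0026. The sketch below cuts a value at the first encoded separator; `cut_at_encoded_separator` is a hypothetical helper, not part of the library.

```python
import re

# Pattern copied from the documented LineData.url_param_split attribute.
url_param_split = re.compile('(%|\\\\u(00){0,2})(26|3f)', re.IGNORECASE)


def cut_at_encoded_separator(value: str) -> str:
    """Hypothetical helper: keep the part before the first encoded '&' or '?'."""
    match = url_param_split.search(value)
    return value[:match.start()] if match else value


percent_amp = cut_at_encoded_separator("abc123%26lang=en")
unicode_amp = cut_at_encoded_separator(r"abc123\u0026lang=en")
percent_question = cut_at_encoded_separator("abc123%3Fdebug=1")
untouched = cut_at_encoded_separator("abc123")
```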

url_scheme_part_regex = re.compile('[0-9A-Za-z.-]{3}')¶

url_value_pattern = re.compile('[^\\s&;"<>\\[\\]^~`{|}]+[&;][^\\s=;"<>\\[\\]^~`{|}]{3,80}=[^\\s;&="<>\\[\\]^~`{|}]{1,80}')¶

variable_strip_pattern = ' \t\n\r\x0b\x0c,\'"-;'¶

Module contents¶