credsweeper.credentials package¶
Submodules¶
credsweeper.credentials.augment_candidates module¶
credsweeper.credentials.candidate module¶
- class credsweeper.credentials.candidate.Candidate(line_data_list, patterns, rule_name, severity, config=None, validations=None, use_ml=False, confidence=Confidence.MODERATE)[source]¶
Bases: object
Candidates that can be credentials.
The class contains a list of LineData objects, some attributes from the Rule object, and the config.
- Parameters:
patterns (List[Pattern]) – Regular expressions that can be used for detection
rule_name (str) – Name of Rule
severity (Severity) – critical/high/medium/low
confidence (Confidence) – strong/moderate/weak
validations (Optional[List[Validation]]) – List of Validation objects that can check this credential using an external API
use_ml (bool) – Whether ML should work on this credential. If not, the prediction is based on regular expressions and filters only
- classmethod get_dummy_candidate(config, file_path, file_type, info)[source]¶
Create a dummy instance to use when searching files by extension
- is_api_validation_available()[source]¶
Check if current credential candidate can be validated with external API.
- Return type:
bool
- Returns:
True if any validation is available, False otherwise
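As a rough illustration of the check described above, the sketch below uses a hypothetical stand-in class (`CandidateSketch` is not part of credsweeper) and assumes that "available" simply means the `validations` list is non-empty. The real method may apply additional conditions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CandidateSketch:
    """Hypothetical stand-in for Candidate; it holds only the validations list."""
    validations: Optional[List[object]] = None

    def is_api_validation_available(self) -> bool:
        # Assumption: at least one attached validation object means "available".
        return bool(self.validations)
```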
credsweeper.credentials.candidate_group_generator module¶
credsweeper.credentials.candidate_key module¶
credsweeper.credentials.credential_manager module¶
- class credsweeper.credentials.credential_manager.CredentialManager[source]¶
Bases: object
The manager allows you to store, add and delete separate credential candidates.
- Parameters:
candidates – list of credential candidates
- group_credentials()[source]¶
Join candidates that reference the same secret value in the same line.
A candidate can belong to two groups at the same time if it has more than one LineData object inside
- Return type:
- Returns:
Dictionary of [path, line_num, value] -> list of credential candidates
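The grouping described above can be sketched as follows. `LineDataSketch` and `CandidateSketch` are hypothetical stand-ins that carry only the fields needed for the key; the function shows one plausible way to build the `(path, line_num, value)` mapping, not the library's actual code.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LineDataSketch:
    """Hypothetical stand-in for LineData with only the grouping fields."""
    path: str
    line_num: int
    value: str


@dataclass
class CandidateSketch:
    """Hypothetical stand-in for Candidate."""
    rule_name: str
    line_data_list: List[LineDataSketch]


GroupKey = Tuple[str, int, str]


def group_credentials(candidates: List[CandidateSketch]) -> Dict[GroupKey, List[CandidateSketch]]:
    groups: Dict[GroupKey, List[CandidateSketch]] = defaultdict(list)
    for candidate in candidates:
        # A candidate with several LineData objects may land in several groups.
        for line_data in candidate.line_data_list:
            key = (line_data.path, line_data.line_num, line_data.value)
            # Avoid adding the same candidate object twice to one group.
            if not any(member is candidate for member in groups[key]):
                groups[key].append(candidate)
    return dict(groups)


# Two rules fire on the same secret in cfg.py line 3; the second also spans line 4.
a = CandidateSketch("Password", [LineDataSketch("cfg.py", 3, "hunter2")])
b = CandidateSketch("Token", [LineDataSketch("cfg.py", 3, "hunter2"),
                              LineDataSketch("cfg.py", 4, "abc")])
groups = group_credentials([a, b])
```

Here `b` appears in two groups because it owns two LineData objects, which matches the note about a candidate belonging to more than one group.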
- purge_duplicates()[source]¶
Purge duplicate candidates that may appear in overlaps during a long-line scan.
- Return type:
int
- Returns:
Number of removed duplicates
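A minimal sketch of order-preserving de-duplication that reports how many items were dropped. The identity key used here (path, line number, rule name, value) is an assumption for illustration; the real manager may compare candidates differently.

```python
from typing import List, NamedTuple


class CandidateSketch(NamedTuple):
    """Hypothetical stand-in; the whole tuple acts as the duplicate key."""
    path: str
    line_num: int
    rule_name: str
    value: str


def purge_duplicates(candidates: List[CandidateSketch]) -> int:
    seen = set()
    unique = []
    for candidate in candidates:
        # Keep only the first occurrence of each key.
        if candidate not in seen:
            seen.add(candidate)
            unique.append(candidate)
    removed = len(candidates) - len(unique)
    candidates[:] = unique  # update the list in place
    return removed
```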
credsweeper.credentials.line_data module¶
- class credsweeper.credentials.line_data.LineData(config, line, line_pos, line_num, path, file_type, info, pattern, match_obj=None)[source]¶
Bases: object
Object to treat and store scanned line related data.
- Parameters:
key – Optional[str] = None
line (str) – string variable, line
line_num (int) – int variable, number of line in file
path (str) – string variable, path to file
file_type (str) – string variable, extension of file ‘.txt’
info (str) – additional info about how the data was detected
pattern (Pattern) – regex pattern, detected pattern in line
separator – optional string variable, separators between variable and value
separator_start – optional variable, separator position start
value – optional string variable, detected value in line
variable – optional string variable, detected variable in line
- EXCEPTION_POSITION = -2¶
- INITIAL_WRONG_POSITION = -3¶
- bash_param_split = re.compile('\\s+(\\-|\\||\\>|\\w+?\\>|\\&)')¶
- clean_bash_parameters()[source]¶
Split variable and value by bash special characters if the line is assumed to be a CLI command.
- Return type:
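To illustrate what the `bash_param_split` pattern listed above matches, the sketch below cuts a value at the first whitespace followed by `-`, `|`, `>`, a redirect such as `2>`, or `&`. The helper name `cut_at_bash_special` is hypothetical, and the conditions under which the real method applies the split are not shown here.

```python
import re

# Pattern copied from the LineData.bash_param_split attribute.
bash_param_split = re.compile('\\s+(\\-|\\||\\>|\\w+?\\>|\\&)')


def cut_at_bash_special(value: str) -> str:
    """Hypothetical helper: keep only the text before the first bash special token."""
    # With maxsplit=1 the first list element is everything before the first match.
    return bash_param_split.split(value, maxsplit=1)[0]
```

For example, a value captured as `secret123 --host example.com` would be reduced to `secret123` by this helper.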
- clean_toml_parameters()[source]¶
Parentheses, curly braces and square brackets may be caught in TOML format and in bash. This method performs a simple cleanup.
- Return type:
- clean_url_parameters()[source]¶
Clean ‘query parameters’ from a URL.
If the line seems to be a URL, split it by the & character. The variable should be the right-most value after & or ? ([-1]), and the value should be the left-most part before & ([0]).
- Return type:
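The `[-1]` and `[0]` indexing mentioned above can be sketched like this. `split_url_parameters` is a hypothetical helper that takes and returns plain strings; it omits the URL detection step and any handling of percent-encoded separators.

```python
import re
from typing import Tuple


def split_url_parameters(variable: str, value: str) -> Tuple[str, str]:
    """Hypothetical helper mirroring the described split on '&' and '?'."""
    # Variable: right-most piece after '&' or '?', i.e. index [-1].
    variable = re.split(r"[&?]", variable)[-1]
    # Value: left-most piece before '&', i.e. index [0].
    value = value.split("&")[0]
    return variable, value
```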
- comment_starts = ('//', '* ', '#', '/*', '<!––', '%{', '%', '...', '(*', '--', '--[[', '#=')¶
- compare(other)[source]¶
Comparison method: skips the whole line and checks only whether the variable and value are the same
- Return type:
- static get_hash_or_subtext(text, hashed, cut_pos=None)[source]¶
Represent non-empty text with a hash or a “beauty” subtext if required
- Parameters:
- Return type:
- Returns:
SHA-256 hash, in hex representation, of the UTF-8 encoded input text; or a subtext from start to end; or the original text as is
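The three possible results can be sketched with the standard library. This is a simplified stand-in: the subtext branch here just truncates at `cut_pos`, which is an assumption, since the page does not say how the “beauty” subtext is built.

```python
import hashlib
from typing import Optional


def hash_or_subtext(text: Optional[str], hashed: bool, cut_pos: Optional[int] = None) -> Optional[str]:
    """Hypothetical sketch of the hash / subtext / passthrough choice."""
    if not text:
        # Empty or missing text is returned unchanged.
        return text
    if hashed:
        # SHA-256 of the UTF-8 bytes, as a 64-character hex string.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()
    if cut_pos is not None:
        # Assumption: plain truncation stands in for the real subtext logic.
        return text[:cut_pos]
    return text
```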
- initialize(match_obj=None)[source]¶
Apply regex to the candidate line and set internal fields based on match.
- Return type:
- is_comment()[source]¶
Check if the line with the credential is a comment.
- Return type:
bool
- Returns:
True if the line is a comment, False otherwise
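One plausible way to implement such a check with the `comment_starts` tuple listed on this page is a prefix test on the left-stripped line. The function name `looks_like_comment` is hypothetical, and the real method may differ in detail.

```python
# Tuple copied verbatim from the LineData.comment_starts attribute.
comment_starts = ('//', '* ', '#', '/*', '<!––', '%{', '%', '...', '(*', '--', '--[[', '#=')


def looks_like_comment(line: str) -> bool:
    """Hypothetical helper: True if the line begins with a known comment prefix."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return line.lstrip().startswith(comment_starts)
```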
- property is_quoted: bool¶
Check if the variable and value are in a quoted string.
- Returns:
True if the candidate is in a quoted string, False otherwise
- is_source_file()[source]¶
Check if the file with the credential is a source code file or not (data, log, plain text).
- Return type:
bool
- Returns:
True if the file is a source file, False otherwise
- is_source_file_with_quotes()[source]¶
Check if the file with the credential requires quotation for string literals.
- Return type:
bool
- Returns:
True if the file requires quotation, False otherwise
- property is_well_quoted_value: bool¶
A well-quoted value means the value has been quoted or has a line wrap
- line_endings = re.compile('\\\\{1,8}[nr]')¶
- quotation_marks = ('"', "'", '`')¶
- sanitize_variable()[source]¶
Remove trailing spaces, dashes and quotations around the variable. Correct position.
- Return type:
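A minimal sketch of the stripping step, using the `variable_strip_pattern` characters listed on this page. The function works on plain arguments and returns a new start offset; treating "correct position" as a shift by the number of removed leading characters is an assumption.

```python
from typing import Tuple

# Characters copied from the LineData.variable_strip_pattern attribute.
variable_strip_pattern = ' \t\n\r\x0b\x0c,\'"-;'


def sanitize_variable(variable: str, variable_start: int) -> Tuple[str, int]:
    """Hypothetical sketch: strip surrounding noise and shift the start offset."""
    # Count how many leading characters will be removed.
    leading = len(variable) - len(variable.lstrip(variable_strip_pattern))
    cleaned = variable.strip(variable_strip_pattern)
    return cleaned, variable_start + leading
```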
- to_json(hashed, subtext)[source]¶
Convert line data object to dictionary.
- Return type:
- Returns:
Dictionary object generated from current line data
- to_str(subtext=False, hashed=False)[source]¶
Represent line_data with subtext and/or hashed values
- Return type:
- url_chars_not_allowed_pattern = re.compile('[\\s"<>\\[\\]^~`{|}]')¶
- url_param_split = re.compile('(%|\\\\u(00){0,2})(26|3f)', re.IGNORECASE)¶
- url_scheme_part_regex = re.compile('[0-9A-Za-z.-]{3}')¶
- url_value_pattern = re.compile('[^\\s&;"<>\\[\\]^~`{|}]+[&;][^\\s=;"<>\\[\\]^~`{|}]{3,80}=[^\\s;&="<>\\[\\]^~`{|}]{1,80}')¶
- variable_strip_pattern = ' \t\n\r\x0b\x0c,\'"-;'¶