credsweeper.credentials package
Submodules
credsweeper.credentials.augment_candidates module
- credsweeper.credentials.augment_candidates.augment_candidates(candidates: List[Candidate], new_candidates: List[Candidate])[source]
Augments candidates with entries from new_candidates whose line data value is not already present in candidates
- Parameters:
candidates – [IN/OUT] list of candidates to be augmented
new_candidates – [IN] list with new candidates
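The in-place augmentation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `SketchLineData`, `SketchCandidate`, and `augment_sketch` are hypothetical stand-ins, and comparing only the first line data value of each candidate is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchLineData:
    """Hypothetical stand-in for LineData; only the detected value is kept."""
    value: str


@dataclass
class SketchCandidate:
    """Hypothetical stand-in for Candidate."""
    rule_name: str
    line_data_list: List[SketchLineData] = field(default_factory=list)


def augment_sketch(candidates: List[SketchCandidate],
                   new_candidates: List[SketchCandidate]) -> None:
    """Append new candidates whose value is not yet known (modifies candidates in place)."""
    # Assumption: only the first LineData value of each candidate is compared.
    known = {c.line_data_list[0].value for c in candidates if c.line_data_list}
    for candidate in new_candidates:
        if not candidate.line_data_list:
            continue
        value = candidate.line_data_list[0].value
        if value not in known:
            candidates.append(candidate)
            known.add(value)


found = [SketchCandidate("Token", [SketchLineData("abc123")])]
extra = [
    SketchCandidate("Password", [SketchLineData("abc123")]),  # value already present
    SketchCandidate("API key", [SketchLineData("xyz789")]),   # new value
]
augment_sketch(found, extra)
print([c.rule_name for c in found])
```

Because `candidates` is an [IN/OUT] parameter, the sketch returns nothing and the caller observes the change through the list it passed in.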
credsweeper.credentials.candidate module
- class credsweeper.credentials.candidate.Candidate(line_data_list: List[LineData], patterns: List[Pattern], rule_name: str, severity: Severity, config: Config | None = None, use_ml: bool = False, confidence: Confidence = Confidence.MODERATE)[source]
Bases: object
Candidates that can be credentials.
Class contains list of LineData, some attributes from Rule object, and config
- Parameters:
line_data_list – List of LineData
patterns – Regular expressions that can be used for detection
rule_name – Name of Rule
severity – critical/high/medium/low
confidence – strong/moderate/weak
config – user configs
use_ml – Whether the candidate should be validated with ML. If not, ml_probability is set to None
- DUMMY_PATTERN = re.compile('^')
- classmethod get_dummy_candidate(config: Config, file_path: str, file_type: str, info: str, rule_name: str)[source]
Create a dummy instance for use when searching files by extension
- to_dict_list(hashed: bool, subtext: bool) List[dict][source]
Convert credential candidate object to List[dict].
- Returns:
List[dict] object generated from current credential candidate
credsweeper.credentials.candidate_group_generator module
credsweeper.credentials.candidate_key module
credsweeper.credentials.credential_manager module
- class credsweeper.credentials.credential_manager.CredentialManager[source]
Bases: object
The manager allows you to store, add and delete separate credential candidates.
- add_credential(candidate: Candidate) None[source]
Add credential candidate to the manager.
- Parameters:
candidate – credential candidate to be added
- get_credentials() List[Candidate][source]
Get all credential candidates stored in the manager.
- Returns:
List with all Candidate objects stored in manager
- group_credentials() CandidateGroupGenerator[source]
Join candidates that reference the same secret value in the same line.
A candidate can belong to two groups at the same time if it contains more than one LineData object
- Returns:
Contains a dictionary of [path, line_num, value] -> credential candidates list
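The grouping rule can be illustrated with a small sketch, assuming a plain dictionary keyed by `(path, line_num, value)`. The names `SketchLineData`, `SketchCandidate`, and `group_sketch` are hypothetical; the real method returns a CandidateGroupGenerator, whose internals are not described here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SketchLineData:
    """Hypothetical stand-in for LineData."""
    path: str
    line_num: int
    value: str


@dataclass
class SketchCandidate:
    """Hypothetical stand-in for Candidate."""
    rule_name: str
    line_data_list: List[SketchLineData]


GroupKey = Tuple[str, int, str]


def group_sketch(candidates: List[SketchCandidate]) -> Dict[GroupKey, List[SketchCandidate]]:
    """Map (path, line_num, value) to the candidates that reference it."""
    groups: Dict[GroupKey, List[SketchCandidate]] = defaultdict(list)
    for candidate in candidates:
        # A candidate with several LineData objects is filed under each key.
        for line_data in candidate.line_data_list:
            key = (line_data.path, line_data.line_num, line_data.value)
            if not any(existing is candidate for existing in groups[key]):
                groups[key].append(candidate)
    return dict(groups)


password = SketchCandidate("Password", [SketchLineData("a.py", 3, "s3cr3t")])
token = SketchCandidate("Token", [SketchLineData("a.py", 3, "s3cr3t"),
                                  SketchLineData("a.py", 4, "other")])
groups = group_sketch([password, token])
for key, members in groups.items():
    print(key, [c.rule_name for c in members])
```

Here `token` appears in two groups because it carries two LineData objects, which matches the note above about a candidate belonging to more than one group.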
- len_credentials() int[source]
Get number of credential candidates stored in the manager.
- Returns:
Non-negative integer
- purge_duplicates() int[source]
Purge duplicate candidates which may appear in overlaps during a long line scan.
- Returns:
Number of removed duplicates
credsweeper.credentials.line_data module
- class credsweeper.credentials.line_data.LineData(config: Config, line: str, line_pos: int, line_num: int, path: str, file_type: str, info: str, pattern: Pattern, match_obj: Match | None = None)[source]
Bases: object
Object to treat and store scanned line related data.
- Parameters:
key – Optional[str] = None
line – string variable, line
line_num – int variable, number of line in file
path – string variable, path to file
file_type – string variable, extension of file, e.g. ‘.txt’
info – additional info about how the data was detected
pattern – regex pattern, detected pattern in line
separator – optional string variable, separators between variable and value
separator_start – optional variable, separator position start
value – optional string variable, detected value in line
variable – optional string variable, detected variable in line
- EXCEPTION_POSITION = -2
- INITIAL_WRONG_POSITION = -3
- bash_param_split = re.compile('\\s+(\\-|\\||\\>|\\w+?\\>|\\&)')
- clean_bash_parameters() None[source]
Split variable and value by bash special characters if the line is assumed to be a CLI command.
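To show what the `bash_param_split` pattern matches, the sketch below copies the regex from the attribute listed above and truncates a value at the first match. The helper `cut_at_bash_special` is hypothetical, and cutting the value at the match start is an assumption about how the pattern is applied.

```python
import re

# Pattern copied from the bash_param_split class attribute.
bash_param_split = re.compile('\\s+(\\-|\\||\\>|\\w+?\\>|\\&)')


def cut_at_bash_special(value: str) -> str:
    """Keep only the text before whitespace followed by -, |, >, N> or &."""
    match = bash_param_split.search(value)
    return value[:match.start()] if match else value


for raw in ("hunter2 --host db.local",
            "hunter2 | tee out.txt",
            "hunter2 2>err.log",
            "hunter2"):
    print(repr(raw), "->", repr(cut_at_bash_special(raw)))
```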
- clean_tag_parameters() None[source]
Remove the closing tag from the value if the opening tag appears earlier in the line
- clean_toml_parameters() None[source]
Parentheses, curly braces and square brackets may be caught in TOML format and bash. Performs a simple cleanup
- clean_url_parameters() None[source]
Clean ‘query parameters’ from a URL address.
If the line seems to be a URL, split it by the & character. The variable should be the rightmost part after & or ? ([-1]), and the value should be the leftmost part before & ([0])
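The [-1] and [0] indexing can be sketched as below. This is an illustration under assumptions: `clean_url_sketch` is a hypothetical free function taking an over-captured variable and value, whereas the real method works on the LineData instance and performs additional checks that are not shown here.

```python
import re
from typing import Tuple


def clean_url_sketch(variable: str, value: str) -> Tuple[str, str]:
    """Trim URL query noise from an over-captured variable and value."""
    # Variable: the rightmost piece after '&' or '?', hence index [-1].
    variable = re.split(r"[&?]", variable)[-1]
    # Value: the leftmost piece before '&', hence index [0].
    value = value.split("&")[0]
    return variable, value


# Assumed capture from: https://example.com/api?user=bob&token=abc123&lang=en
print(clean_url_sketch("https://example.com/api?user=bob&token", "abc123&lang=en"))
```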
- comment_starts = ('//', '* ', '# ', '/*', '<!––', '%{', '%', '...', '(*', '--', '--[[', '#=')
- compare(other: LineData) bool[source]
Comparison method: ignores the whole line and checks only whether the variable and value are the same
- get_colored_line(hashed: bool, subtext: bool = False) str[source]
Represents the LineData with a value, separator, and variable color formatting
- static get_hash_or_subtext(text: str | None, hashed: bool, cut_pos: StartEnd | None = None) str | None[source]
Represent non-empty text as a hash or a “beautified” subtext, if required
- Parameters:
text – str - input string
hashed – bool - whether the text will be hashed and returned
cut_pos – Optional[StartEnd] - start and end positions of the text that must be kept in the output
- Returns:
SHA-256 hash (hex representation) of the UTF-8 encoded input text, or the subtext from start to end, or the original text as is
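The three possible return shapes can be sketched with the standard library. `hash_or_subtext_sketch` is a hypothetical stand-in. Two things are assumptions here: hashing takes precedence over the cut positions, and a plain slice replaces the “beautified” subtext whose exact formatting is not described in this section.

```python
import hashlib
from typing import Optional, Tuple


def hash_or_subtext_sketch(text: Optional[str],
                           hashed: bool,
                           cut_pos: Optional[Tuple[int, int]] = None) -> Optional[str]:
    """Return a SHA-256 hex digest, a slice of the text, or the text unchanged."""
    if not text:
        # Empty or missing text is passed through untouched.
        return text
    if hashed:
        # Assumption: hashing wins over cut_pos when both are given.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()
    if cut_pos is not None:
        start, end = cut_pos
        # Assumption: a plain slice; the real subtext may add context or markers.
        return text[start:end]
    return text


print(hash_or_subtext_sketch("abc", hashed=True))
print(hash_or_subtext_sketch("password=hunter2", hashed=False, cut_pos=(9, 16)))
print(hash_or_subtext_sketch("password=hunter2", hashed=False))
```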
- initialize(match_obj: Match | None = None) None[source]
Apply regex to the candidate line and set internal fields based on match.
- is_comment() bool[source]
Check if line with credential is a comment.
- Returns:
True if line is a comment, False otherwise
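A prefix check against the `comment_starts` tuple could look like the sketch below. The tuple is copied verbatim from the class attribute in this section. Stripping leading whitespace and using `str.startswith` are assumptions, and `is_comment_sketch` is a hypothetical helper; the real method may apply further conditions.

```python
# Tuple copied verbatim from the comment_starts class attribute.
comment_starts = ('//', '* ', '# ', '/*', '<!––', '%{', '%', '...', '(*', '--', '--[[', '#=')


def is_comment_sketch(line: str) -> bool:
    """Return True if the line, ignoring leading whitespace, starts like a comment."""
    # str.startswith accepts a tuple and checks every prefix in it.
    return line.lstrip().startswith(comment_starts)


for sample in ("    # password = 'hunter2'",
               "// token: abc123",
               "password = 'hunter2'"):
    print(repr(sample), is_comment_sketch(sample))
```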
- property is_quoted: bool
Check if variable and value are in a quoted string.
- Returns:
True if candidate is in a quoted string, False otherwise
- is_source_file() bool[source]
Check if file with credential is a source code file or not (data, log, plain text).
- Returns:
True if file is source file, False otherwise
- is_source_file_with_quotes() bool[source]
Check if file with credential requires quotation for string literals.
- Returns:
True if file requires quotation, False otherwise
- property is_well_quoted_value: bool
Well quoted value: the value has been quoted or has a line wrap
- line_endings = re.compile('\\\\{1,8}[nr]')
- quotation_marks = ('"', "'", '`')
- sanitize_variable() None[source]
Remove trailing spaces, dashes and quotations around the variable. Correct position.
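Stripping the variable while keeping its position consistent can be sketched as follows. The character set is copied from the `variable_strip_pattern` attribute of this class. `sanitize_sketch` and its `variable_start` argument are hypothetical, and the way the position is corrected is an assumption.

```python
from typing import Tuple

# Characters copied from the variable_strip_pattern class attribute.
variable_strip_pattern = ' \t\n\r\x0b\x0c,\'"-;'


def sanitize_sketch(variable: str, variable_start: int) -> Tuple[str, int, int]:
    """Strip noise around a variable and return (variable, start, end)."""
    left_trimmed = variable.lstrip(variable_strip_pattern)
    # Shift the start by the number of characters removed on the left.
    start = variable_start + (len(variable) - len(left_trimmed))
    cleaned = left_trimmed.rstrip(variable_strip_pattern)
    return cleaned, start, start + len(cleaned)


print(sanitize_sketch('--"api_key" ', 10))
```

The argument to `str.lstrip` and `str.rstrip` is treated as a set of characters, so the `-` inside the pattern is a literal dash and not a range.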
- to_json(hashed: bool, subtext: bool) Dict[source]
Convert line data object to dictionary.
- Returns:
Dictionary object generated from current line data
- to_str(subtext: bool = False, hashed: bool = False) str[source]
Represent line_data with subtext and/or hashed values
- url_chars_not_allowed_pattern = re.compile('[\\s"<>\\[\\]^~`{|}]')
- url_percent_split = re.compile('%(21|23|24|26|27|28|29|2a|2b|2c|2f|3a|3b|3d|3f|40|5b|5d)', re.IGNORECASE)
- url_scheme_part_regex = re.compile('[0-9A-Za-z.-]{3}')
- url_unicode_split = re.compile('\\\\u00(0000)?(21|23|24|26|27|28|29|2a|2b|2c|2f|3a|3b|3d|3f|40|5b|5d)', re.IGNORECASE)
- url_value_pattern = re.compile('[^\\s&;"<>\\[\\]^~`{|}]+[&;][^\\s=;"<>\\[\\]^~`{|}]{3,80}=[^\\s;&="<>\\[\\]^~`{|}]{1,80}')
- variable_strip_pattern = ' \t\n\r\x0b\x0c,\'"-;'
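The compiled patterns listed above can be tried in isolation. The sketch below copies `line_endings` and `url_percent_split` from the attributes in this section and shows what they match; how the class applies them internally is not described here, so the sample strings and the use of `split` and `finditer` are illustrative assumptions.

```python
import re

# Patterns copied from the class attributes of LineData.
line_endings = re.compile('\\\\{1,8}[nr]')
url_percent_split = re.compile(
    '%(21|23|24|26|27|28|29|2a|2b|2c|2f|3a|3b|3d|3f|40|5b|5d)', re.IGNORECASE)

# line_endings matches escaped line breaks written as text: one to eight
# backslashes followed by 'n' or 'r'.
escaped = r"first\nsecond\\rthird"
parts = line_endings.split(escaped)
print(parts)

# url_percent_split matches percent-encoded URL delimiters such as %3D ('=')
# and %26 ('&'); %20 (space) is not in the list, so it is left alone.
encoded = "key%3Dabc%26x%20y"
delimiters = [m.group(0) for m in url_percent_split.finditer(encoded)]
print(delimiters)
```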