Credsweeper package
CredSweeper
- class credsweeper.app.CredSweeper(rule_path=None, config_path=None, api_validation=False, json_filename=None, xlsx_filename=None, sort_output=False, use_filters=True, pool_count=1, ml_batch_size=16, ml_threshold=ThresholdPreset.medium, azure=False, cuda=False, find_by_ext=False, depth=0, doc=False, severity=Severity.INFO, size_limit=None, exclude_lines=None, exclude_values=None, log_level=None)
Bases: object
Advanced credential analyzer base class.
- Parameters:
credential_manager – CredSweeper credential manager object
scanner – CredSweeper scanner object
pool_count (int) – number of pools used to run multiprocessing scanning
config – dictionary variable, stores analyzer features
json_filename (Union[None, str, Path]) – string variable, credential candidates export filename
- class MlValidator(threshold, azure=False, cuda=False)
Bases: object
ML validation class
- extract_common_features(candidates)
Extract features that are guaranteed to be the same for all candidates on the same line with the same value.
- Return type:
- extract_unique_features(candidates)
Extract features that can differ between candidates. Join them with the OR operator.
- Return type:
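As a rough illustration of what joining per-candidate features with an OR operator can look like, the sketch below merges binary feature rows element-wise. It is not CredSweeper's implementation: the `or_join` helper and the sample feature values are hypothetical and exist only for this example.

```python
from typing import List, Sequence


def or_join(feature_rows: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary feature rows element-wise with logical OR (hypothetical helper)."""
    if not feature_rows:
        return []
    width = len(feature_rows[0])
    if any(len(row) != width for row in feature_rows):
        raise ValueError("all feature rows must have the same length")
    # A merged feature is set if it is set for at least one candidate.
    return [int(any(column)) for column in zip(*feature_rows)]


# Hypothetical binary features for two candidates that share a line and a value.
candidate_a = [1, 0, 0, 1]
candidate_b = [0, 0, 1, 1]
merged = or_join([candidate_a, candidate_b])
print(merged)
```

With this approach a feature stays active in the merged row whenever any single candidate in the group triggered it.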
- get_group_features(value, candidates)
np.newaxis is used to add a new dimension in front, so the input will be treated as a batch.
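The effect of `np.newaxis` on a single feature vector can be seen in the short sketch below. The feature values are made up for illustration, and the snippet only demonstrates the NumPy indexing behavior, not the internals of `get_group_features`.

```python
import numpy as np

# Hypothetical feature vector for one candidate group, shape (4,).
features = np.array([0.2, 0.0, 1.0, 0.5])

# Indexing with np.newaxis prepends an axis of length 1, giving shape (1, 4),
# which is the layout of a batch that holds exactly one sample.
batched = features[np.newaxis]

print(features.shape, batched.shape)
```

Models that expect batched input can then consume the array without a special case for single samples.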
- validate_groups(group_list, batch_size)
Apply the ML model to a list of candidate groups.
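One plausible reading of the `batch_size` argument is that the group list is processed in fixed-size chunks. The sketch below shows that chunking pattern in isolation, assuming the last batch may be shorter than the rest; `iter_batches` and the group names are hypothetical and are not part of the CredSweeper API.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical stand-ins for candidate groups.
groups = [f"group_{i}" for i in range(5)]
batches = list(iter_batches(groups, batch_size=2))

for batch in batches:
    print(batch)
```

Processing in batches bounds how much data is handed to the model in one call, which is the usual reason for exposing a batch size.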
- export_results()
Save credential candidates to a JSON file or print them to the console.
- Return type:
- file_scan(content_provider)
Run scanning of a file from ‘content_provider’.
- Parameters:
content_provider (Union[DiffContentProvider, TextContentProvider]) – content provider object to scan
- Return type:
- Returns:
list of credential candidates from scanned file
- property ml_validator: MlValidator
ml_validator getter
- post_processing()
Machine learning validation for received credential candidates.
- Return type:
- run(content_provider)
Run an analysis of ‘content_provider’ object.
- Parameters:
content_provider (AbstractProvider) – path objects to scan
- Return type:
- scan(content_providers)
Run scanning of the files given in the “content_providers” argument.
- Parameters:
content_providers (Sequence[Union[DiffContentProvider, TextContentProvider]]) – file objects to scan
- Return type: