Credsweeper package
CredSweeper
- class credsweeper.app.CredSweeper(rule_path=None, config_path=None, api_validation=False, json_filename=None, xlsx_filename=None, use_filters=True, pool_count=1, ml_batch_size=16, ml_threshold=ThresholdPreset.medium, find_by_ext=False, depth=0, size_limit=None, exclude_lines=None, exclude_values=None)
Bases:
object
Advanced credential analyzer base class.
- Parameters:
credential_manager – CredSweeper credential manager object
scanner – CredSweeper scanner object
pool_count (int) – number of pools used to run multiprocessing scanning
config – dictionary variable, stores analyzer features
json_filename (Optional[str]) – string variable, credential candidates export filename
- class MlValidator(threshold)
Bases:
object
ML validation class
- extract_common_features(candidates)
Extract features that are guaranteed to be the same for all candidates on the same line with same value.
- Return type:
- extract_unique_features(candidates)
Extract features that can be different between candidates. Join them with the OR operator.
- Return type:
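The OR-join mentioned for extract_unique_features can be illustrated with a minimal sketch. It assumes each candidate contributes a binary feature vector of equal length, which the description above does not state. The helper name `join_features_with_or` and the sample vectors are hypothetical and are not part of CredSweeper.

```python
from typing import List


def join_features_with_or(feature_rows: List[List[int]]) -> List[int]:
    """Combine binary feature vectors element-wise with the OR operator.

    Hypothetical helper for illustration only, not CredSweeper's implementation.
    """
    if not feature_rows:
        return []
    joined = [0] * len(feature_rows[0])
    for row in feature_rows:
        if len(row) != len(joined):
            raise ValueError("all feature vectors must have the same length")
        # A feature is set in the result if any candidate sets it.
        joined = [a | b for a, b in zip(joined, row)]
    return joined


# Two candidates found on the same line, each setting a different feature.
rows = [[1, 0, 0, 0], [0, 0, 1, 0]]
merged = join_features_with_or(rows)
```

Under this assumption, a group of candidates collapses to one vector in which a feature is set when at least one candidate sets it.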
- get_group_features(value, candidates)
np.newaxis is used to add a new dimension in front, so the input is treated as a batch.
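The effect of np.newaxis described above can be seen in a short sketch. The feature values and variable names are made up for illustration; only the NumPy indexing behaviour is shown, not CredSweeper's actual feature layout.

```python
import numpy as np

# Hypothetical feature vector for a single candidate group: shape (3,).
features = np.array([0.2, 0.7, 0.1], dtype=np.float32)

# np.newaxis inserts a leading axis of length 1, giving shape (1, 3),
# so code that expects a batch sees a batch containing one item.
batched = features[np.newaxis, :]
```

The data is unchanged; only the shape differs, which is what lets a single sample be passed to code that expects a batch.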
- validate_groups(group_list, batch_size)
Apply the ML model to a list of candidate groups.
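How a batch_size argument is typically applied to a list of groups can be sketched as follows. This assumes batch_size controls how many groups are handed to the model per call, which the description above does not state explicitly. The helper `iter_batches` and the group labels are hypothetical.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items.

    Hypothetical helper for illustration only, not CredSweeper's implementation.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Five placeholder groups split into batches of two; the last batch is shorter.
groups = ["g0", "g1", "g2", "g3", "g4"]
batches = list(iter_batches(groups, 2))
```

Each yielded batch would then be passed to the model in one call, so a larger batch_size means fewer calls at the cost of more memory per call.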
- data_scan(data_provider, depth, recursive_limit_size)
Recursively scan files that may be containers, such as ZIP archives.
- Parameters:
data_provider (DataContentProvider) – DataContentProvider object, which may be a container
depth (int) – maximum recursion depth
recursive_limit_size (int) – maximum number of bytes of opened files, to guard against recursive zip-bomb attacks
- Return type:
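The interplay of a depth limit and a byte limit during recursive container scanning can be sketched with the standard library alone. This is not CredSweeper's implementation: `scan_container` and `make_zip` are hypothetical helpers, the "scan" merely records the path of each leaf entry, and the rule that each nested level receives the limit reduced by the size of the opened member is an assumption.

```python
import io
import zipfile
from typing import Dict, List


def scan_container(data: bytes, depth: int, limit: int, name: str = "<root>") -> List[str]:
    """Return the paths of leaf entries reachable within the depth and size limits."""
    if depth > 0 and zipfile.is_zipfile(io.BytesIO(data)):
        found: List[str] = []
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                # Skip directories and members whose declared size exceeds the limit.
                if info.is_dir() or info.file_size > limit:
                    continue
                member = archive.read(info)
                # Recurse one level deeper with the limit reduced by this member's size.
                found.extend(
                    scan_container(member, depth - 1, limit - len(member), f"{name}/{info.filename}")
                )
        return found
    # Not a container, or the depth is exhausted: treat the data as a leaf.
    return [name]


def make_zip(files: Dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP archive from a name-to-content mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for filename, content in files.items():
            archive.writestr(filename, content)
    return buffer.getvalue()


inner = make_zip({"secret.txt": b"password=hunter2"})
outer = make_zip({"inner.zip": inner, "readme.txt": b"hello"})

deep = scan_container(outer, depth=2, limit=1_000_000)
shallow = scan_container(outer, depth=1, limit=1_000_000)
```

With depth=2 the nested archive is opened and its text file is reached; with depth=1 the nested archive is treated as an opaque leaf. A very small limit causes every member to be skipped.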
- export_results()
Save credential candidates to a JSON file or print them to the console.
- Return type:
- file_scan(content_provider)
Scan the file supplied by ‘content_provider’.
- Parameters:
content_provider (ContentProvider) – content provider object to scan
- Return type:
- Returns:
list of credential candidates from scanned file
- property ml_validator: MlValidator
ml_validator getter
- Return type:
MlValidator
- post_processing()
Machine learning validation for received credential candidates.
- Return type:
- run(content_provider)
Run an analysis of the ‘content_provider’ object.
- Parameters:
content_provider (FilesProvider) – path objects to scan
- Return type:
- scan(content_providers)
Scan the files given in the “content_providers” argument.
- Parameters:
content_providers (Union[List[DiffContentProvider], List[TextContentProvider]]) – file objects to scan
- Return type: