credsweeper.file_handler package
Submodules
credsweeper.file_handler.abstract_provider module
- class credsweeper.file_handler.abstract_provider.AbstractProvider(paths: Sequence[str | Path | BytesIO | Tuple[str | Path, BytesIO]])[source]
Bases: ABC
Base class for all file provider objects.
- abstract get_scannable_files(config: Config) Sequence[DiffContentProvider | TextContentProvider][source]
Get the list of file objects for analysis, based on the “paths” attribute.
- Parameters:
config – dict of credsweeper configuration
- Returns:
file objects to analyse
credsweeper.file_handler.analysis_target module
credsweeper.file_handler.byte_content_provider module
credsweeper.file_handler.content_provider module
- class credsweeper.file_handler.content_provider.ContentProvider(file_path: str | None = None, file_type: str | None = None, info: str | None = None)[source]
Bases: ABC
Base class to provide access to analysis targets for a scanned object.
- property descriptor: Descriptor
descriptor getter
credsweeper.file_handler.data_content_provider module
- class credsweeper.file_handler.data_content_provider.DataContentProvider(data: bytes, file_path: str | None = None, file_type: str | None = None, info: str | None = None)[source]
Bases: ContentProvider
Dummy raw provider to keep bytes.
- represent_as_encoded() bool[source]
Decodes data from base64. Stores the result in decoded.
- Returns:
True if the data was correctly parsed and verified
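The decode-and-verify idea can be illustrated with the standard library alone. This is a minimal sketch, not the CredSweeper implementation: `try_decode_base64` is a hypothetical helper, and treating "verified" as strict alphabet validation is an assumption.

```python
import base64
import binascii
from typing import Optional


def try_decode_base64(data: bytes) -> Optional[bytes]:
    """Hypothetical helper: return decoded bytes, or None if data is not valid base64."""
    try:
        # validate=True rejects any character outside the base64 alphabet
        return base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return None
```

A caller could map the result to the boolean described above with `try_decode_base64(data) is not None`.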
- represent_as_html(depth: int, recursive_limit_size: int, keywords_required_substrings_check: Callable[[str], bool]) bool[source]
Tries to read data as html
- Returns:
True if reading was successful
- represent_as_structure() bool[source]
Tries to convert data with several parsers. Stores the result in the internal structure.
- Returns:
True if some structure was found
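Trying several parsers in turn can be sketched as follows. The parser list here (JSON, then Python literals) is an assumption chosen for illustration, and `parse_structure` is a hypothetical name; the parsers actually used by `represent_as_structure` may differ.

```python
import ast
import json
from typing import Any, Callable, Optional, Sequence


def parse_structure(text: str) -> Optional[Any]:
    """Hypothetical helper: return the first container parsed from text, else None."""
    # Assumed parser order; each parser either returns a value or raises.
    parsers: Sequence[Callable[[str], Any]] = (json.loads, ast.literal_eval)
    for parser in parsers:
        try:
            result = parser(text)
        except (ValueError, SyntaxError, TypeError):
            continue  # this parser does not understand the text, try the next one
        # Only containers count as "structure"; bare scalars are ignored.
        if isinstance(result, (dict, list, tuple)):
            return result
    return None
```

Returning `None` when nothing matches lets a caller convert the outcome to a boolean, as the method above does.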
credsweeper.file_handler.descriptor module
credsweeper.file_handler.diff_content_provider module
- class credsweeper.file_handler.diff_content_provider.DiffContentProvider(file_path: str, change_type: DiffRowType, diff: List[DiffDict])[source]
Bases: ContentProvider
Provide data from a single .patch file.
- Parameters:
file_path – path to file
change_type – whether to scan added or deleted file data
diff – list of file row changes, with base elements represented as:
{ "old": line number before diff, "new": line number after diff, "line": line text, "hunk": diff hunk number }
- parse_lines_data(lines_data: List[DiffRowData]) Tuple[List[int], List[str]][source]
Parse diff lines data.
Return the list of line numbers with change type “self.change_type” and the list of all lines in the file in original order (all lines not mentioned in the diff file are replaced with a blank line).
- Parameters:
lines_data – data of all rows mentioned in diff file
- Returns:
tuple of line numbers with change type “self.change_type” and all file lines in original order (all lines not mentioned in the diff file are replaced with a blank line)
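The shape of the returned tuple can be illustrated with a small standalone sketch. `Row` and `collect_lines` are hypothetical stand-ins, not the library's `DiffRowData` or `parse_lines_data`, and the string change types are an assumption.

```python
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    """Hypothetical stand-in for one row of a diff."""
    change: str     # assumed values: "added" or "deleted"
    line_numb: int  # 1-based line number in the file
    line: str       # line text


def collect_lines(rows: List[Row], change_type: str) -> Tuple[List[int], List[str]]:
    """Return matching line numbers and the file lines, blank where the diff is silent."""
    wanted = [row for row in rows if row.change == change_type]
    # The reconstructed file is as long as the highest matching line number.
    size = max((row.line_numb for row in wanted), default=0)
    all_lines = [""] * size
    numbers = []
    for row in wanted:
        all_lines[row.line_numb - 1] = row.line
        numbers.append(row.line_numb)
    return numbers, all_lines
```

Keeping blank placeholders preserves the original line positions, so a finding on a changed line can still be reported with its real line number.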
credsweeper.file_handler.file_path_extractor module
- class credsweeper.file_handler.file_path_extractor.FilePathExtractor[source]
Bases: object
Utility class to browse files in directories.
- static apply_gitignore(detected_files: List[str]) List[str][source]
Apply gitignore rules for each file.
- Parameters:
detected_files – list of files to be checked
- Returns:
List of files with all files ignored by git removed
- static check_exclude_file(config: Config, path: str) bool[source]
Checks whether the file should be excluded
- Parameters:
config – Config
path – str - full path preferred
- Returns:
True when the file's full path should be excluded according to the config
- static check_file_size(config: Config, reference: str | Path | BytesIO | Tuple[str | Path, BytesIO]) bool[source]
Checks whether the file is over the size limit from configuration
- Parameters:
config – Config
reference – various types of a file reference
- Returns:
True when the file is oversize
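Because the reference may be a path, an in-memory stream, or a (path, stream) pair, a size check has to branch on the type. The following is a minimal sketch under assumptions: `is_oversize` is a hypothetical function, and the limit is passed as a plain integer instead of being read from Config.

```python
import os
from io import BytesIO
from pathlib import Path
from typing import Tuple, Union

FileRef = Union[str, Path, BytesIO, Tuple[Union[str, Path], BytesIO]]


def is_oversize(reference: FileRef, size_limit: int) -> bool:
    """Hypothetical helper: True when the referenced content exceeds size_limit bytes."""
    if isinstance(reference, tuple):
        # (path, stream) pair: measure the in-memory stream, not the path
        size = reference[1].getbuffer().nbytes
    elif isinstance(reference, BytesIO):
        size = reference.getbuffer().nbytes
    else:
        # str or Path: ask the file system
        size = os.path.getsize(reference)
    return size > size_limit
```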
- static get_file_paths(config: Config, path: str | Path) List[str][source]
Get all files in the directory. Automatically excludes non-code and data files (such as .jpg).
- Parameters:
config – credsweeper configuration
path – path to the file or directory to be scanned
- Returns:
List of all non-excluded files in the directory
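Directory traversal with extension-based exclusion can be sketched with the standard library. `list_files` and its `excluded_extensions` parameter are hypothetical; the real method takes its exclusion rules from Config and may apply further checks.

```python
import os
from pathlib import Path
from typing import List, Set, Union


def list_files(path: Union[str, Path], excluded_extensions: Set[str]) -> List[str]:
    """Hypothetical helper: list files under path, skipping excluded extensions."""
    path = Path(path)
    # A single file is treated as a one-element listing.
    candidates = [path] if path.is_file() else [
        Path(root) / name for root, _dirs, names in os.walk(path) for name in names
    ]
    # Compare lowercased suffixes so ".JPG" and ".jpg" are treated alike.
    return sorted(
        str(item) for item in candidates
        if item.suffix.lower() not in excluded_extensions
    )
```

For example, `list_files("project", {".jpg", ".png"})` would return every file under `project` except images with those two extensions.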
- static is_find_by_ext_file(config: Config, extension: str) bool[source]
Checks whether the file has a suspicious extension
- Parameters:
config – Config
extension – str - may be just a file name with an extension
- Returns:
True when the feature is configured and the file extension matches
credsweeper.file_handler.files_provider module
- class credsweeper.file_handler.files_provider.FilesProvider(paths: Sequence[str | Path | BytesIO | Tuple[str | Path, BytesIO]], skip_ignored: bool | None = None)[source]
Bases: AbstractProvider
Provider of plain OS files to be analysed.
- get_scannable_files(config: Config) Sequence[DiffContentProvider | TextContentProvider][source]
Get the list of full-text file objects for analysis of files with parent paths from “paths”.
- Parameters:
config – dict of credsweeper configuration
- Returns:
preprocessed file objects for analysis
credsweeper.file_handler.patches_provider module
- class credsweeper.file_handler.patches_provider.PatchesProvider(paths: Sequence[str | Path | BytesIO | Tuple[str | Path, BytesIO]], change_type: DiffRowType)[source]
Bases: AbstractProvider
Provide data from a list of .patch files.
- get_files_sequence(raw_patches: List[List[str]]) Sequence[DiffContentProvider | TextContentProvider][source]
Returns sequence of files
- get_scannable_files(config: Config) Sequence[DiffContentProvider | TextContentProvider][source]
Get files to scan. Output based on the paths field.
- Parameters:
config – dict of credsweeper configuration
- Returns:
file objects for analysing
credsweeper.file_handler.string_content_provider module
credsweeper.file_handler.struct_content_provider module
- class credsweeper.file_handler.struct_content_provider.StructContentProvider(struct: Any, file_path: str | None = None, file_type: str | None = None, info: str | None = None)[source]
Bases: ContentProvider
Content provider to keep structured data.
credsweeper.file_handler.text_content_provider module
- class credsweeper.file_handler.text_content_provider.TextContentProvider(file_path: str | Path | Tuple[str | Path, BytesIO], file_type: str | None = None, info: str | None = None)[source]
Bases: ContentProvider
Provide access to analysis targets for full-text file scanning.
- Parameters:
file_path – string, path to file
Module contents
- class credsweeper.file_handler.ByteContentProvider(content: bytes, file_path: str | None = None, file_type: str | None = None, info: str | None = None)[source]
Bases: ContentProvider
Allows scanning a byte sequence without an additional file read.
- class credsweeper.file_handler.ContentProvider(file_path: str | None = None, file_type: str | None = None, info: str | None = None)[source]
Bases: ABC
Base class to provide access to analysis targets for a scanned object.
- property descriptor: Descriptor
descriptor getter
- class credsweeper.file_handler.DataContentProvider(data: bytes, file_path: str | None = None, file_type: str | None = None, info: str | None = None)[source]
Bases: ContentProvider
Dummy raw provider to keep bytes.
- represent_as_encoded() bool[source]
Decodes data from base64. Stores the result in decoded.
- Returns:
True if the data was correctly parsed and verified
- represent_as_html(depth: int, recursive_limit_size: int, keywords_required_substrings_check: Callable[[str], bool]) bool[source]
Tries to read data as html
- Returns:
True if reading was successful
- represent_as_structure() bool[source]
Tries to convert data with several parsers. Stores the result in the internal structure.
- Returns:
True if some structure was found
- class credsweeper.file_handler.DiffContentProvider(file_path: str, change_type: DiffRowType, diff: List[DiffDict])[source]
Bases: ContentProvider
Provide data from a single .patch file.
- Parameters:
file_path – path to file
change_type – whether to scan added or deleted file data
diff – list of file row changes, with base elements represented as:
{ "old": line number before diff, "new": line number after diff, "line": line text, "hunk": diff hunk number }
- parse_lines_data(lines_data: List[DiffRowData]) Tuple[List[int], List[str]][source]
Parse diff lines data.
Return the list of line numbers with change type “self.change_type” and the list of all lines in the file in original order (all lines not mentioned in the diff file are replaced with a blank line).
- Parameters:
lines_data – data of all rows mentioned in diff file
- Returns:
tuple of line numbers with change type “self.change_type” and all file lines in original order (all lines not mentioned in the diff file are replaced with a blank line)
- class credsweeper.file_handler.StringContentProvider(lines: List[str], line_numbers: List[int] | None = None, file_path: str | None = None, file_type: str | None = None, info: str | None = None)[source]
Bases: ContentProvider
Provider that scans simple text lines.