credsweeper.file_handler package
Submodules
credsweeper.file_handler.abstract_provider module
- class credsweeper.file_handler.abstract_provider.AbstractProvider(paths: Sequence[str | Path | BytesIO | Tuple[str | Path, BytesIO]])
Bases: ABC
Base class for all file provider objects.
- abstract get_scannable_files(config: Config) → Sequence[ContentProvider]
Get the list of file objects for analysis based on the “paths” attribute.
- Parameters:
config – credsweeper configuration (Config)
- Returns:
file objects to analyse
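The abstract-provider contract described above can be illustrated with a minimal, self-contained sketch. It does not import credsweeper: `SketchProvider`, `SuffixProvider`, and the `"suffixes"` key of the plain-dict config are hypothetical names chosen for illustration only.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Sequence, Union


class SketchProvider(ABC):
    """Hypothetical stand-in for an abstract provider; not the library class."""

    def __init__(self, paths: Sequence[Union[str, Path]]) -> None:
        self.paths = paths

    @abstractmethod
    def get_scannable_files(self, config: dict) -> List[str]:
        """Return the objects to analyse, derived from self.paths."""


class SuffixProvider(SketchProvider):
    """Hypothetical subclass: keeps only paths with an allowed suffix."""

    def get_scannable_files(self, config: dict) -> List[str]:
        allowed = config.get("suffixes", [])
        return [str(p) for p in self.paths if Path(p).suffix in allowed]


provider = SuffixProvider(["a.py", "b.jpg", Path("c.py")])
scannable = provider.get_scannable_files({"suffixes": [".py"]})
```

Because `get_scannable_files` is abstract, the base class cannot be instantiated directly; only subclasses that implement the method can.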
credsweeper.file_handler.analysis_target module
credsweeper.file_handler.byte_content_provider module
credsweeper.file_handler.content_provider module
- class credsweeper.file_handler.content_provider.ContentProvider(file_path: str | None = None, file_type: str | None = None, info: str | None = None)
Bases: ABC
Base class to provide access to analysis targets for scanned object.
- property descriptor: Descriptor
descriptor getter
credsweeper.file_handler.data_content_provider module
- class credsweeper.file_handler.data_content_provider.DataContentProvider(data: bytes, file_path: str | None = None, file_type: str | None = None, info: str | None = None)
Bases: ContentProvider
Dummy raw provider to keep bytes
- property data: bytes | None
Read-only getter for the data of DataContentProvider; the property is used in deep scan
- represent_as_html(depth: int, recursive_limit_size: int, keywords_required_substrings_check: Callable[[str], bool]) → bool | None
Tries to read data as HTML
- Returns:
True if reading was successful, False if no data was found, None if the format is not acceptable
- represent_as_structure() → bool | None
Tries to convert data with many parsers. Stores the result in an internal structure
- Returns:
True if some structure was found, False if no data was found, None if the format is not acceptable
- represent_as_xml() → bool | None
Tries to read data as XML
- Returns:
True if reading was successful, False if no data was found, None if the format is not acceptable
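The three-valued return convention shared by the represent_as_* methods (True, False, None) can be sketched as follows. This is an illustration only: `represent_as_json` and `describe` are hypothetical helpers built on the standard `json` module, not part of credsweeper.

```python
import json
from typing import Optional


def represent_as_json(data: bytes) -> Optional[bool]:
    """Hypothetical parser following the True / False / None convention."""
    try:
        parsed = json.loads(data)
    except (ValueError, UnicodeDecodeError):
        return None  # the format is not acceptable
    return bool(parsed)  # True: content found, False: parsed but empty


def describe(result: Optional[bool]) -> str:
    # Compare with `is` so that False and None are not conflated.
    if result is None:
        return "unsupported format"
    if result is False:
        return "no data"
    return "ok"
```

A caller that writes `if not result:` would treat "no data" and "unsupported format" the same way, so identity checks against None are the safer choice with this kind of return type.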
- static simple_html_representation(html: BeautifulSoup) → Tuple[List[int], List[str], int]
Simple parse of the HTML as it is displayed to the user; appends the lines
credsweeper.file_handler.descriptor module
credsweeper.file_handler.diff_content_provider module
- class credsweeper.file_handler.diff_content_provider.DiffContentProvider(file_path: str, change_type: DiffRowType, diff: List[DiffDict])
Bases: ContentProvider
Provide data from a single .patch file.
- Parameters:
file_path – path to file
change_type – set added or deleted file data to scan
diff – list of file row changes, with base elements represented as:
{ "old": line number before diff, "new": line number after diff, "line": line text, "hunk": diff hunk number }
- static parse_lines_data(change_type: DiffRowType, lines_data: List[DiffRowData]) → Tuple[List[int], List[str]]
Parse diff lines data.
Returns the list of line numbers with change type “change_type” and the list of all lines in the file in original order (lines not mentioned in the diff file are replaced with blank lines).
- Parameters:
change_type – set added or deleted file data to scan
lines_data – data of all rows mentioned in diff file
- Returns:
tuple of line numbers with change type “change_type” and all file lines in original order (lines not mentioned in the diff file are replaced with blank lines)
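The behaviour described above (selected line numbers plus a file body with blank placeholders) can be sketched in a few lines. This follows the description only and the library's actual implementation may differ; `Row` and `select_lines` are hypothetical stand-ins, and 1-based line numbers are an assumption.

```python
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    """Hypothetical stand-in for one diff row."""
    change: str  # "added" or "deleted"
    number: int  # line number in the file, assumed 1-based
    text: str


def select_lines(change: str, rows: List[Row]) -> Tuple[List[int], List[str]]:
    wanted = [r for r in rows if r.change == change]
    numbers = [r.number for r in wanted]
    # Build the file body; lines the diff does not mention stay blank.
    lines = [""] * (max(numbers) if numbers else 0)
    for r in wanted:
        lines[r.number - 1] = r.text
    return numbers, lines


rows = [
    Row("added", 2, "token = 'x'"),
    Row("deleted", 2, "token = ''"),
    Row("added", 4, "print(token)"),
]
numbers, lines = select_lines("added", rows)
```

Keeping blank placeholders preserves the original line positions, so a finding on an added line can be reported with the line number it has in the real file.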
- static patch2files_diff(raw_patch: List[str], change_type: DiffRowType) → Dict[str, List[DiffDict]]
Generate files changes from patch for added or deleted filepaths.
- Parameters:
raw_patch – git patch file content
change_type – change type to select, DiffRowType.ADDED or DiffRowType.DELETED
- Returns:
dict of {file path: list of file row changes}, where each element of the list is represented as:
{ "old": line number before diff, "new": line number after diff, "line": line text, "hunk": diff hunk number }
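To make the row format concrete, here is a minimal sketch that turns the lines of one unified-diff hunk into dicts with the four keys listed above. It is not the library's parser: `hunk_rows` is a hypothetical helper, and it assumes that an added row has no "old" number and a deleted row has no "new" number.

```python
import re
from typing import Dict, List, Union

# Matches a unified-diff hunk header such as "@@ -1,3 +1,4 @@".
HUNK = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def hunk_rows(hunk_lines: List[str]) -> List[Dict[str, Union[int, str, None]]]:
    """Convert hunk lines (without file headers) into row dicts."""
    rows: List[Dict[str, Union[int, str, None]]] = []
    old = new = hunk = 0
    for line in hunk_lines:
        header = HUNK.match(line)
        if header:
            old, new = int(header.group(1)), int(header.group(2))
            hunk += 1
        elif line.startswith("\\"):
            continue  # e.g. "\ No newline at end of file"
        elif line.startswith("+"):
            rows.append({"old": None, "new": new, "line": line[1:], "hunk": hunk})
            new += 1
        elif line.startswith("-"):
            rows.append({"old": old, "new": None, "line": line[1:], "hunk": hunk})
            old += 1
        else:  # context line: present on both sides
            old += 1
            new += 1
    return rows


patch = ["@@ -1,3 +1,3 @@", " import os", "-password = ''", "+password = 'secret'", " print(password)"]
rows = hunk_rows(patch)
added = [r for r in rows if r["old"] is None]
```

Selecting rows by which side is missing is one simple way to separate added from deleted lines before scanning.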
- static preprocess_diff_rows(added_line_number: int | None, deleted_line_number: int | None, line: str) → List[DiffRowData]
Auxiliary function to extend diff changes.
- Parameters:
added_line_number – number of added line or None
deleted_line_number – number of deleted line or None
line – the text line
- Returns:
diff rows data as a list of row change type, line number, row content
- static preprocess_file_diff(changes: List[DiffDict]) → List[DiffRowData]
Generate changed file rows from diff data with changed lines (e.g. marked + or - in diff).
- Parameters:
changes – git diff by file rows data
- Returns:
diff rows data as a list of row change type, line number, row content
credsweeper.file_handler.file_path_extractor module
- class credsweeper.file_handler.file_path_extractor.FilePathExtractor
Bases: object
Util class to browse files in directories
- FIND_BY_EXT_RULE = 'Suspicious File Extension'
- static apply_gitignore(detected_files: List[str]) → List[str]
Apply gitignore rules for each file.
- Parameters:
detected_files – list of files to be checked
- Returns:
List of files with all files ignored by git removed
- static check_exclude_file(config: Config, path: str) → bool
Checks whether the file should be excluded
- Parameters:
config – Config
path – str - full path preferred
- Returns:
True when the full path of the file should be excluded according to the config
- static check_file_size(config: Config, reference: str | Path | BytesIO | Tuple[str | Path, BytesIO]) → bool
Checks whether the file is over the size limit from the configuration or smaller than MIN_DATA_LEN
- Parameters:
config – Config
reference – various types of a file reference
- Returns:
True when the file is oversized, smaller than MIN_DATA_LEN, or of an unsupported reference type
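A size check of this kind can be sketched as below. This is an illustration under stated assumptions, not the library's code: `should_skip`, `MIN_LEN`, and `MAX_LEN` are hypothetical, the limits are arbitrary, and the tuple form of a reference is left out for brevity.

```python
import io
import os
from pathlib import Path
from typing import Union

MIN_LEN = 8            # hypothetical lower bound, in bytes
MAX_LEN = 1024 * 1024  # hypothetical upper bound, in bytes


def should_skip(reference: Union[str, Path, io.BytesIO]) -> bool:
    """Return True when the referenced data should not be scanned."""
    if isinstance(reference, io.BytesIO):
        size = reference.getbuffer().nbytes  # size of the in-memory buffer
    elif isinstance(reference, (str, Path)):
        size = os.path.getsize(reference)    # size of the file on disk
    else:
        return True  # unsupported reference type
    return size < MIN_LEN or size > MAX_LEN
```

Note that True means "skip", mirroring the return description above, so callers would typically write `if not should_skip(ref): scan(ref)`.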
- static get_file_paths(config: Config, path: str | Path) → List[str]
Get all files in the directory. Automatically excludes non-code or data files (such as .jpg).
- Parameters:
config – credsweeper configuration
path – path to the file or directory to be scanned
- Returns:
List of all non-excluded files in the directory
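The directory walk with extension-based exclusion can be sketched with the standard library alone. `list_files` and its `excluded_ext` argument are hypothetical; in the library the exclusion rules come from the Config object, which is not modelled here.

```python
import os
import tempfile
from pathlib import Path
from typing import Collection, List, Union


def list_files(path: Union[str, Path], excluded_ext: Collection[str]) -> List[str]:
    """Collect files under path, skipping the excluded extensions."""
    root = Path(path)
    if root.is_file():
        candidates = [root]
    else:
        candidates = [Path(d) / f for d, _, files in os.walk(root) for f in files]
    return sorted(str(p) for p in candidates if p.suffix.lower() not in excluded_ext)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("app.py", "logo.jpg", "notes.txt"):
        Path(tmp, name).write_text("x")
    found = [Path(p).name for p in list_files(tmp, {".jpg"})]
```

The function accepts either a single file or a directory, matching the "path to the file or directory" wording of the parameter description.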
- static is_find_by_ext_file(config: Config, extension: str) → bool
Checks whether the file has a suspicious extension
- Parameters:
config – Config
extension – str - may be only file name with extension
- Returns:
True when the feature is configured and the file extension matches
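The check above combines a feature switch with an extension lookup. A minimal sketch, assuming a case-insensitive comparison and using the hypothetical names `has_suspicious_extension`, `enabled`, and `suspicious` in place of the Config object:

```python
import os
from typing import Collection


def has_suspicious_extension(enabled: bool, suspicious: Collection[str], name: str) -> bool:
    """Return True when the feature is on and the extension is listed."""
    if not enabled:
        return False
    base = os.path.basename(name)
    dot = base.rfind(".")
    # Accept a bare extension such as ".pem" as well as a full file name.
    ext = base[dot:].lower() if dot >= 0 else ""
    return ext in suspicious


flags = [
    has_suspicious_extension(True, {".pem", ".p12"}, "keys/server.PEM"),
    has_suspicious_extension(True, {".pem", ".p12"}, ".pem"),
    has_suspicious_extension(True, {".pem", ".p12"}, "README"),
    has_suspicious_extension(False, {".pem", ".p12"}, "server.pem"),
]
```

The extension is extracted by hand rather than with `os.path.splitext`, because `splitext(".pem")` treats the whole name as a dotfile and returns an empty extension.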
credsweeper.file_handler.files_provider module
- class credsweeper.file_handler.files_provider.FilesProvider(paths: Sequence[str | Path | BytesIO | Tuple[str | Path, BytesIO]], skip_ignored: bool | None = None)
Bases: AbstractProvider
Provider of plain OS files to be analysed.
- get_scannable_files(config: Config) → Sequence[ContentProvider]
Get the list of full-text file objects for analysis of files whose parent paths are given in “paths”.
- Parameters:
config – credsweeper configuration (Config)
- Returns:
preprocessed file objects for analysis
credsweeper.file_handler.patches_provider module
- class credsweeper.file_handler.patches_provider.PatchesProvider(paths: Sequence[str | Path | BytesIO | Tuple[str | Path, BytesIO]], change_type: DiffRowType)
Bases: AbstractProvider
Provide data from a list of .patch files.
- get_files_sequence(raw_patches: List[List[str]]) → Sequence[ContentProvider]
Returns the sequence of files
- get_scannable_files(config: Config) → Sequence[ContentProvider]
Get files to scan. The output is based on the “paths” field.
- Parameters:
config – credsweeper configuration (Config)
- Returns:
file objects for analysing
credsweeper.file_handler.string_content_provider module
credsweeper.file_handler.struct_content_provider module
- class credsweeper.file_handler.struct_content_provider.StructContentProvider(struct: Any, file_path: str | None = None, file_type: str | None = None, info: str | None = None)
Bases: ContentProvider
Content provider to keep structured data
credsweeper.file_handler.text_content_provider module
- class credsweeper.file_handler.text_content_provider.TextContentProvider(file_path: str | Path | Tuple[str | Path, BytesIO], file_type: str | None = None, info: str | None = None)
Bases: ContentProvider
Provide access to analysis targets for full-text file scanning.
- Parameters:
file_path – string, path to file