credsweeper.deep_scanner package
Submodules
credsweeper.deep_scanner.abstract_scanner module
- class credsweeper.deep_scanner.abstract_scanner.AbstractScanner[source]
Bases: ABC
Base abstract class for all recursive scanners
- abstract data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Abstract method to be defined in DeepScanner
- deep_scan_with_fallback(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate][source]
Scans with deep scanners and fallback scanners if possible
- Parameters:
data_provider – DataContentProvider with raw data
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack
Returns: list with candidates
- abstract static get_deep_scanners(data: bytes, descriptor: Descriptor, depth: int, limit: int) Tuple[List[Any], List[Any]][source]
Returns the scan methods that may apply to the data, chosen by its content, together with the fallback scanners
- static key_value_combination(structure: dict) Generator[Tuple[Any, Any], None, None][source]
Combines items by key and value from a dictionary for augmentation: {…, "key": "api_key", "value": "XXXXXXX", …} -> ("api_key", "XXXXXXX")
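A minimal sketch of the pairing described above, not CredSweeper's actual implementation. The function name `pair_key_value` and the field names in `KEY_FIELDS` and `VALUE_FIELDS` are hypothetical; the library may inspect different fields.

```python
from typing import Any, Dict, Generator, Tuple

# Hypothetical field names, chosen only for this illustration.
KEY_FIELDS = ("key", "name")
VALUE_FIELDS = ("value",)


def pair_key_value(structure: Dict[str, Any]) -> Generator[Tuple[Any, Any], None, None]:
    """Yield (name, secret) pairs built from key-like and value-like items."""
    for key_field in KEY_FIELDS:
        if key_field not in structure:
            continue
        for value_field in VALUE_FIELDS:
            if value_field in structure:
                yield structure[key_field], structure[value_field]


pairs = list(pair_key_value({"id": 1, "key": "api_key", "value": "XXXXXXX"}))
print(pairs)
```

Pairing the two fields lets a rule that expects `api_key = XXXXXXX` see the name and the secret together, even though the source stored them as separate items.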
- static read_compressed_with_limit(file: LZMAFile | GzipFile | BZ2File, limit: int) bytes[source]
Reads data from a single compressed file while checking the size limit
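One way to enforce such a limit is to request one byte more than allowed and treat a full read as an overflow. The sketch below assumes that approach; `read_with_limit` is a hypothetical name, and raising `ValueError` on overflow is an assumption, since the page does not say how the library reacts.

```python
import gzip
import io


def read_with_limit(file, limit: int) -> bytes:
    """Read at most `limit` decompressed bytes; raise if the stream holds more."""
    # Ask for one extra byte: receiving it means the stream exceeds the limit.
    data = file.read(limit + 1)
    if len(data) > limit:
        raise ValueError(f"decompressed size exceeds limit of {limit} bytes")
    return data


compressed = gzip.compress(b"A" * 1000)

with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as f:
    small = read_with_limit(f, limit=2000)

try:
    with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as f:
        read_with_limit(f, limit=100)
    too_big = False
except ValueError:
    too_big = True
```

Because the read is capped, a highly compressed stream is never inflated much beyond `limit` bytes in memory.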
- recursive_scan(data_provider: DataContentProvider, depth: int = 0, recursive_limit_size: int = 0) List[Candidate][source]
Recursive function to scan files which might be containers like ZIP archives
- Parameters:
data_provider – DataContentProvider object that may be a container
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack
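The interplay of `depth` and `recursive_limit_size` can be illustrated with a standalone sketch that descends into nested ZIP archives. It is not the library's code: `walk_container` and `make_zip` are hypothetical helpers, sizes from the archive headers are trusted for brevity, and the byte budget is not propagated back up from nested archives.

```python
import io
import zipfile
from typing import Dict, List


def walk_container(data: bytes, depth: int, size_budget: int) -> List[bytes]:
    """Return leaf payloads, descending into ZIP archives while limits allow."""
    if depth <= 0 or not zipfile.is_zipfile(io.BytesIO(data)):
        # Recursion is exhausted or the data is not a container.
        return [data]
    leaves: List[bytes] = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if info.file_size > size_budget:
                continue  # member would exceed the remaining byte budget
            size_budget -= info.file_size
            leaves.extend(walk_container(archive.read(info), depth - 1, size_budget))
    return leaves


def make_zip(members: Dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP archive (helper for the demonstration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


inner = make_zip({"a.txt": b"token=abc"})
outer = make_zip({"inner.zip": inner, "b.txt": b"plain"})
deep = walk_container(outer, depth=2, size_budget=1 << 20)
shallow = walk_container(outer, depth=1, size_budget=1 << 20)
```

With `depth=2` the nested archive is opened; with `depth=1` it is returned as an opaque blob, which shows how the depth parameter bounds the work done on hostile input.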
- scan(content_provider: ContentProvider, depth: int, recursive_limit_size: int | None = None) List[Candidate][source]
Initial scan method to launch recursive scan. Skips ByteScanner to prevent extra scan
- Parameters:
content_provider – ContentProvider that might contain raw data
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack
- static structure_processing(structure: Any) Generator[Tuple[Any, Any], None, None][source]
Yields (key, value) pairs from the given structure, if applicable
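A rough sketch of yielding pairs from a structure, under the assumption that dictionaries yield their items and sequences yield index/value pairs. `iter_pairs` is a hypothetical name, and the treatment of lists is a guess rather than documented behaviour.

```python
from typing import Any, Generator, Tuple


def iter_pairs(structure: Any) -> Generator[Tuple[Any, Any], None, None]:
    """Yield (key, value) pairs: dict items as-is, sequence items by index."""
    if isinstance(structure, dict):
        yield from structure.items()
    elif isinstance(structure, (list, tuple)):
        yield from enumerate(structure)
    # Scalars and other types are not applicable and yield nothing.


from_dict = list(iter_pairs({"password": "hunter2"}))
from_list = list(iter_pairs(["x", "y"]))
from_scalar = list(iter_pairs(42))
```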
- structure_scan(struct_provider: StructContentProvider, depth: int, recursive_limit_size: int) List[Candidate][source]
Recursive function to scan structured data
- Parameters:
struct_provider – StructContentProvider object that may be a container
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack
credsweeper.deep_scanner.byte_scanner module
- class credsweeper.deep_scanner.byte_scanner.ByteScanner[source]
Bases: AbstractScanner, ABC
Implements plain data scanning
credsweeper.deep_scanner.bzip2_scanner module
credsweeper.deep_scanner.crx_scanner module
credsweeper.deep_scanner.csv_scanner module
- class credsweeper.deep_scanner.csv_scanner.CsvScanner[source]
Bases: AbstractScanner, ABC
Implements CSV scanning
- CSV_PATTERN = re.compile(b'[^\r\n]{1,8000}[,;\t|\x1f][^\r\n]{1,8000}')
- DELIMITERS = ',;\t|\x1f'
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Tries to scan each row as a structure keyed by column name
- classmethod get_structure(text: str) List[Dict[str, Any]][source]
Reads the text as CSV using a guessed dialect
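Guessing a dialect and reading rows keyed by column name can be done with the standard `csv` module. The sketch below is an assumption about the general approach, not the library's code; `rows_as_dicts` is a hypothetical name, and the delimiter set is copied from `DELIMITERS` above.

```python
import csv
import io
from typing import Any, Dict, List

DELIMITERS = ",;\t|\x1f"  # same characters as CsvScanner.DELIMITERS


def rows_as_dicts(text: str) -> List[Dict[str, Any]]:
    """Guess the dialect, then read each row as {column name: cell}."""
    dialect = csv.Sniffer().sniff(text, delimiters=DELIMITERS)
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    return list(reader)


rows = rows_as_dicts("user;password\nalice;s3cr3t\nbob;hunter2\n")
```

Each resulting dictionary pairs a header such as `password` with the cell value, so a credential rule can use the column name as context for the value.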
- sniffer = <csv.Sniffer object>
credsweeper.deep_scanner.deb_scanner module
- class credsweeper.deep_scanner.deb_scanner.DebScanner[source]
Bases: AbstractScanner, ABC
Implements deb (ar) scanning
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Extracts data file from .ar (debian) archive and launches data_scan
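For orientation, the common ar layout is an 8-byte global magic followed by members, each with a 60-byte header and data padded to an even length. The sketch below walks that layout in memory; it is not the library's implementation, and `iter_ar_members` and `ar_member` are hypothetical helpers.

```python
from typing import Iterator, Tuple

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60


def iter_ar_members(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, payload) for each member of an ar archive held in memory."""
    if not data.startswith(AR_MAGIC):
        return
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":  # every member header ends with this marker
            break
        name = header[0:16].decode("ascii", "replace").rstrip(" /")
        try:
            size = int(header[48:58].decode("ascii").strip())
        except ValueError:
            break  # malformed size field
        start = offset + HEADER_SIZE
        yield name, data[start:start + size]
        # Member data is padded to an even offset.
        offset = start + size + (size & 1)


def ar_member(name: str, payload: bytes) -> bytes:
    """Build one member (helper for the demonstration only)."""
    header = f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(payload):<10}`\n"
    return header.encode("ascii") + payload + (b"\n" if len(payload) % 2 else b"")


archive = AR_MAGIC + ar_member("debian-binary", b"2.0\n") + ar_member("note", b"abc")
members = list(iter_ar_members(archive))
```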
credsweeper.deep_scanner.deep_scanner module
- class credsweeper.deep_scanner.deep_scanner.DeepScanner(config: Config, scanner: Scanner)[source]
Bases: ByteScanner, Bzip2Scanner, CrxScanner, CsvScanner, DocxScanner, EncoderScanner, GzipScanner, HtmlScanner, JclassScanner, JksScanner, LangScanner, LzmaScanner, MxfileScanner, EmlScanner, OdsScanner, PatchScanner, PdfScanner, PkcsScanner, PngScanner, PptxScanner, ProtobufScanner, RtfScanner, RpmScanner, Sqlite3Scanner, StringsScanner, TarScanner, DebScanner, XmlScanner, XlsScanner, XlsxScanner, ZipScanner, ZlibScanner
Advanced scanner that explores data recursively
- MEDIA_PATTERNS: Dict[int, List[Tuple[bytes, Pattern]]] = {0: [(b'\x00\x00\x00\x0cjP \r\n\x87\n', None), (b'\x00\x00\x01\x00', None), (b'\x00\x01\x00\x00\x00', None), (b'\x00\x00\x00', re.compile(b'\x00\x00\x00.ftyp3g')), (b'\x00GITCRYPT\x00', None)], 26: [(b'\x1aE\xdf\xa3', None)], 56: [(b'8BPS\x00\x01\x00\x00\x00\x00\x00\x00', None), (b'8BPS\x00\x02\x00\x00\x00\x00\x00\x00', None)], 66: [(b'BM', re.compile(b'BM.{2}\x00{4}'))], 71: [(b'GIF8', re.compile(b'GIF8[79]a[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 73: [(b'II', re.compile(b'II[+*]\x00[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]')), (b'ID3\x03\x00\x00\x00', None)], 77: [(b'MM', re.compile(b'MM\x00[+*][^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 79: [(b'OggS', re.compile(b'OggS[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]')), (b'OTTO\x00', re.compile(b'OTTO\x00[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 82: [(b'RIF', re.compile(b'RIF[FX].{4}[ 0-9A-Za-z]{4}[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 88: [(b'XFIR', re.compile(b'XFIR.{4}[ 0-9A-Za-z]{4}[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 102: [(b'ftyp', re.compile(b'ftyp(isom|MSNV)[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 103: [(b'gimp xcf', re.compile(b'gimp xcf (file|v001|v002)\x00[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 119: [(b'wOF', re.compile(b'wOF[2F][^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 127: [(b'\x7fELF', re.compile(b'\x7fELF[\x01\x02][\x01\x02]\x01[\x00-\x12]'))], 137: [(b'\x89PNG\r\n\x1a\n', None)], 255: [(b'\xff', re.compile(b'\xff(\xd8\xff[\xdb\xee\xe1\xe0Q]|[\xfb\xf3\xf2])'))]}
- static get_deep_scanners(data: bytes, descriptor: Descriptor, depth: int, limit: int) Tuple[List[Any], List[Any]][source]
Returns the scan methods that may apply to the data, chosen by its content, together with the fallback scanners
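Content-based selection of this kind usually starts from file signatures. The sketch below is only an illustration: `pick_scanners`, the `SIGNATURES` table, and the string labels are hypothetical, and always returning a plain byte scan as the fallback is an assumption, not the library's documented rule.

```python
from typing import List, Tuple

# A few well-known file signatures; the labels are illustrative strings,
# not references to the real scanner classes.
SIGNATURES = [
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "lzma"),
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
]


def pick_scanners(data: bytes) -> Tuple[List[str], List[str]]:
    """Return (deep scanners matched by signature, fallback scanners)."""
    deep = [label for magic, label in SIGNATURES if data.startswith(magic)]
    fallback = ["byte"]  # assumed: a plain byte scan is always available
    return deep, fallback


zip_choice = pick_scanners(b"PK\x03\x04rest of archive")
text_choice = pick_scanners(b"password = hunter2")
```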
credsweeper.deep_scanner.docx_scanner module
credsweeper.deep_scanner.eml_scanner module
credsweeper.deep_scanner.encoder_scanner module
- class credsweeper.deep_scanner.encoder_scanner.EncoderScanner[source]
Bases: AbstractScanner, ABC
Implements recursive iteration when data might be encoded from base64
- BASE64_PATTERN = re.compile(b'(\\xFF\\xFE|\\xFE\\xFF)?((?:(?P<a>[A-Z])|(?P<b>[a-z])|(?P<c>[0-9/+])|[\\s\\x00\\\\])+(?(a)(?(b)(?(c)(=+|$)|(?!x)x)|(?!x)x)|(?!x)x)|(?:(?P<e>[A-Z])|(?P<f>[a-z])|(?P<g>[0-9_-])|[\\s\\x00\\\\])+(?(e)(?)
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Tries to decode base64-encoded data to bytes and scans the result as bytes again
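The decode step can be approximated with the standard library. This sketch assumes standard-alphabet base64 with optional whitespace or NUL bytes between characters; `try_decode_base64` is a hypothetical name, and the real scanner's handling of URL-safe alphabets or byte-order marks is not reproduced here.

```python
import base64
import binascii
from typing import Optional


def try_decode_base64(data: bytes) -> Optional[bytes]:
    """Return decoded bytes, or None when data is not valid standard base64."""
    # Drop whitespace and NUL bytes that often wrap or pad encoded text.
    compact = bytes(b for b in data if b not in b" \t\r\n\x00")
    try:
        return base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return None


encoded = base64.b64encode(b"password=hunter2")
wrapped = encoded[:8] + b"\n" + encoded[8:]
decoded = try_decode_base64(wrapped)
rejected = try_decode_base64(b"not base64!")
```

A successful decode yields raw bytes that can be handed back to the byte-level scan, which is what makes the process recursive.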
credsweeper.deep_scanner.gzip_scanner module
credsweeper.deep_scanner.html_scanner module
- class credsweeper.deep_scanner.html_scanner.HtmlScanner[source]
Bases: AbstractScanner, ABC
Implements html scanning if possible
credsweeper.deep_scanner.jclass_scanner module
- class credsweeper.deep_scanner.jclass_scanner.JclassScanner[source]
Bases: AbstractScanner, ABC
Implements java .class scanning
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Extracts data from binary
- static get_utf8_constants(stream: BytesIO) List[str][source]
Extracts only Utf8 constants from java ClassFile
credsweeper.deep_scanner.jks_scanner module
credsweeper.deep_scanner.lang_scanner module
- class credsweeper.deep_scanner.lang_scanner.LangScanner[source]
Bases: AbstractScanner, ABC
Implements scanning of data if it is a script of some markup language
credsweeper.deep_scanner.lzma_scanner module
credsweeper.deep_scanner.mxfile_scanner module
credsweeper.deep_scanner.ods_scanner module
credsweeper.deep_scanner.pandas_scanner module
credsweeper.deep_scanner.patch_scanner module
credsweeper.deep_scanner.pdf_scanner module
- class credsweeper.deep_scanner.pdf_scanner.PdfScanner[source]
Bases: AbstractScanner, ABC
Implements pdf scanning
credsweeper.deep_scanner.pkcs_scanner module
credsweeper.deep_scanner.png_scanner module
- class credsweeper.deep_scanner.png_scanner.PngScanner[source]
Bases: AbstractScanner, ABC
Implements PNG scanning for text chunks
credsweeper.deep_scanner.pptx_scanner module
credsweeper.deep_scanner.protobuf_scanner module
- class credsweeper.deep_scanner.protobuf_scanner.ProtobufScanner[source]
Bases: AbstractScanner, ABC
Implements protobuf scanning
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Extracts data file from protobuf payload and launches data_scan
- static match_protobuf(data: bytes | bytearray, offset: int, limit: int) bool[source]
Processes data from start to end as a simple protobuf chunk
Returns: True when the whole chunk was consumed as a protobuf structure
- static read_varint(data: bytes | bytearray, offset: int) Tuple[int, int][source]
Reads varint from offset up to 64 bit values (10 bytes)
Returns: used bytes (-1 when overflow), the value
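Protobuf varints store seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. The sketch below follows the return convention described above (bytes used, then value); `decode_varint` is a hypothetical name, and returning 0 as the value on failure is an assumption.

```python
from typing import Tuple


def decode_varint(data: bytes, offset: int) -> Tuple[int, int]:
    """Decode a base-128 varint; return (bytes used, value) or (-1, 0) on failure."""
    value = 0
    for used in range(10):  # a 64-bit value needs at most 10 bytes
        if offset + used >= len(data):
            break  # input ended before the terminating byte
        byte = data[offset + used]
        value |= (byte & 0x7F) << (7 * used)
        if byte & 0x80 == 0:  # a clear high bit marks the last byte
            return used + 1, value
    return -1, 0


two_bytes = decode_varint(b"\x96\x01", 0)
with_offset = decode_varint(b"\x00\x05", 1)
overflow = decode_varint(b"\xff" * 11, 0)
```

The bytes `0x96 0x01` decode to 150: the low seven bits of `0x96` give 22, and `0x01` shifted left by seven adds 128.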
credsweeper.deep_scanner.rpm_scanner module
- class credsweeper.deep_scanner.rpm_scanner.RpmScanner[source]
Bases: AbstractScanner, ABC
Implements rpm scanning
credsweeper.deep_scanner.rtf_scanner module
credsweeper.deep_scanner.sqlite3_scanner module
- class credsweeper.deep_scanner.sqlite3_scanner.Sqlite3Scanner[source]
Bases: AbstractScanner, ABC
Implements SQLite3 database scanning
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Extracts data from the SQLite3 database and launches data_scan
credsweeper.deep_scanner.strings_scanner module
- class credsweeper.deep_scanner.strings_scanner.StringsScanner[source]
Bases: AbstractScanner, ABC
Implements scanning of known binary files through their ASCII string representations
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Scan binary files for ASCII strings
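Extracting ASCII strings from binary data works much like the Unix `strings` tool: collect runs of printable bytes above a minimum length. The sketch below assumes a minimum run of four characters; that threshold and the name `extract_strings` are illustrative, not taken from the library.

```python
import re
from typing import List

# Runs of 4 or more printable ASCII bytes; the minimum length is an assumption.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(data: bytes) -> List[str]:
    """Return printable ASCII runs found in binary data, in order."""
    return [match.group().decode("ascii") for match in ASCII_RUN.finditer(data)]


blob = b"\x00\x01password=hunter2\x00\xff\xfeab\x00token\x7f"
strings = extract_strings(blob)
```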
credsweeper.deep_scanner.tar_scanner module
credsweeper.deep_scanner.tmx_scanner module
- class credsweeper.deep_scanner.tmx_scanner.TmxScanner[source]
Bases: AbstractScanner, ABC
Implements tmx file scanning for values only. Image tags are skipped.
credsweeper.deep_scanner.xls_scanner module
- class credsweeper.deep_scanner.xls_scanner.XlsScanner[source]
Bases: PandasScanner, ABC
Implements xls matching
credsweeper.deep_scanner.xlsx_scanner module
credsweeper.deep_scanner.xml_scanner module
- class credsweeper.deep_scanner.xml_scanner.XmlScanner[source]
Bases: AbstractScanner, ABC
Implements XML scanning
- XML_FIRST_BRACKET_PATTERN = re.compile(b'^\\s*<')
- XML_OPENING_TAG_PATTERN = re.compile(b'<([0-9A-Za-z_]{1,256})')
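The two patterns above suggest a cheap pre-check before full parsing. How the library combines them is not shown on this page, so the following is a guess: `looks_like_xml` is a hypothetical helper that requires a leading `<` and at least one opened tag that is also closed.

```python
import re

# Patterns copied from the class attributes listed above.
XML_FIRST_BRACKET_PATTERN = re.compile(b"^\\s*<")
XML_OPENING_TAG_PATTERN = re.compile(b"<([0-9A-Za-z_]{1,256})")


def looks_like_xml(data: bytes) -> bool:
    """Cheap pre-check: data starts with '<' and some opened tag is also closed."""
    if not XML_FIRST_BRACKET_PATTERN.search(data):
        return False
    for match in XML_OPENING_TAG_PATTERN.finditer(data):
        if b"</" + match.group(1) + b">" in data:
            return True
    return False


is_xml = looks_like_xml(b"  <root><key>value</key></root>")
is_expression = looks_like_xml(b"a < b")
is_unclosed = looks_like_xml(b"<notclosed")
```

A check like this avoids handing arbitrary text to an XML parser, which keeps the common non-XML case fast.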
credsweeper.deep_scanner.zip_scanner module
- class credsweeper.deep_scanner.zip_scanner.ZipScanner[source]
Bases: AbstractScanner, ABC
Implements zip scanning
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Extracts files one by one from zip archives and launches data_scan
credsweeper.deep_scanner.zlib_scanner module
- class credsweeper.deep_scanner.zlib_scanner.ZlibScanner[source]
Bases: AbstractScanner, ABC
Implements inflating zlib data and scanning the result
- data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) List[Candidate] | None[source]
Inflates zlib-compressed data and launches data_scan
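Inflating untrusted zlib data is safer with an output cap. The sketch below uses `zlib.decompressobj` and its `max_length` argument; `inflate_with_limit` is a hypothetical name, and raising `ValueError` when the cap is exceeded is an assumption about the desired behaviour rather than the library's documented one.

```python
import zlib


def inflate_with_limit(data: bytes, limit: int) -> bytes:
    """Inflate zlib data, refusing to produce more than `limit` bytes."""
    inflater = zlib.decompressobj()
    # max_length caps the output; one extra byte reveals an oversized stream.
    result = inflater.decompress(data, limit + 1)
    if len(result) > limit:
        raise ValueError(f"inflated size exceeds limit of {limit} bytes")
    return result


packed = zlib.compress(b"x" * 500)
inflated = inflate_with_limit(packed, limit=1000)

try:
    inflate_with_limit(packed, limit=100)
    rejected = False
except ValueError:
    rejected = True
```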