credsweeper.deep_scanner package

Submodules

credsweeper.deep_scanner.abstract_scanner module

class credsweeper.deep_scanner.abstract_scanner.AbstractScanner[source]

Bases: ABC

Base abstract class for all recursive scanners

exception LimitError[source]

Bases: Exception

Decompressed data exceeds configured limit

abstract property config: Config: Abstract property to be defined in DeepScanner

abstract data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Abstract method to be defined in DeepScanner

deep_scan_with_fallback(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate][source]

Scans with deep scanners and fallback scanners if possible

Parameters:

data_provider – DataContentProvider with raw data
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack

Returns: list with candidates

abstract static get_deep_scanners(data: bytes, descriptor: Descriptor, depth: int, limit: int) → Tuple[List[Any], List[Any]][source]: Returns the possible scan methods for the data, depending on its content, together with the fallback scanners

static key_value_combination(structure: dict) → Generator[Tuple[Any, Any], None, None][source]: Combines the items stored under key and value in a dictionary for augmentation: {…, "key": "api_key", "value": "XXXXXXX", …} -> ("api_key", "XXXXXXX")
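The pairing described for key_value_combination can be illustrated with a minimal sketch. It is not the library's implementation: the function name combine_key_value is hypothetical, and it assumes the dictionary uses exactly the names "key" and "value" shown in the example.

```python
from typing import Any, Dict, Generator, Tuple


def combine_key_value(structure: Dict[str, Any]) -> Generator[Tuple[Any, Any], None, None]:
    """Yield (key, value) when the dict stores them under "key" and "value".

    Hypothetical helper: the field names are an assumption taken from the
    example in the description, not from the library source.
    """
    if "key" in structure and "value" in structure:
        yield structure["key"], structure["value"]


record = {"id": 7, "key": "api_key", "value": "XXXXXXX"}
pairs = list(combine_key_value(record))
print(pairs)
```

The extra pair lets a scanner treat "api_key" as the variable name and "XXXXXXX" as its value, even though neither is a key of the original dictionary.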

abstract static match(data: bytes | bytearray) → bool[source]: Abstract method for any deep scanner

static read_compressed_with_limit(file: LZMAFile | GzipFile | BZ2File, limit: int) → bytes[source]: Reads data from a single compressed file and checks it against the size limit
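One common way to enforce such a limit is to request one byte more than allowed and treat its arrival as an overflow. The sketch below assumes that approach; read_with_limit and the local LimitError class are stand-ins, not the library's code.

```python
import gzip
import io


class LimitError(Exception):
    """Local stand-in for AbstractScanner.LimitError (hypothetical copy)."""


def read_with_limit(file, limit: int) -> bytes:
    """Read at most `limit` decompressed bytes; raise if more is available."""
    # Ask for one extra byte: receiving it proves the payload exceeds the limit.
    data = file.read(limit + 1)
    if len(data) > limit:
        raise LimitError(f"decompressed data exceeds {limit} bytes")
    return data


packed = gzip.compress(b"A" * 1000)

with gzip.GzipFile(fileobj=io.BytesIO(packed)) as f:
    small = read_with_limit(f, 4096)  # fits within the limit

try:
    with gzip.GzipFile(fileobj=io.BytesIO(packed)) as f:
        read_with_limit(f, 100)  # 1000 bytes do not fit in 100
    exceeded = False
except LimitError:
    exceeded = True
```

Because the read is bounded, a highly compressed payload is never expanded fully in memory before the check fires.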

recursive_scan(data_provider: DataContentProvider, depth: int = 0, recursive_limit_size: int = 0) → List[Candidate][source]

Recursive function to scan files which might be containers like ZIP archives

Parameters:

data_provider – DataContentProvider object that may be a container
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack

scan(content_provider: ContentProvider, depth: int, recursive_limit_size: int | None = None) → List[Candidate][source]

Initial scan method to launch recursive scan. Skips ByteScanner to prevent extra scan

Parameters:

content_provider – ContentProvider that might contain raw data
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack

abstract property scanner: Scanner: Abstract property to be defined in DeepScanner

static structure_processing(structure: Any) → Generator[Tuple[Any, Any], None, None][source]: Yields pair key, value from given structure if applicable

structure_scan(struct_provider: StructContentProvider, depth: int, recursive_limit_size: int) → List[Candidate][source]

Recursive function to scan structured data

Parameters:

struct_provider – StructContentProvider object that may be a container
depth – maximal level of recursion
recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack

static structure_size(structure: Any) → int[source]: Calculates approximated size of structure data

credsweeper.deep_scanner.byte_scanner module

class credsweeper.deep_scanner.byte_scanner.ByteScanner[source]

Bases: AbstractScanner, ABC

Implements plain data scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to represent data as plain text with splitting by lines and scan as text lines

static match(data: bytes | bytearray) → bool[source]: Match for any

credsweeper.deep_scanner.bzip2_scanner module

class credsweeper.deep_scanner.bzip2_scanner.Bzip2Scanner[source]

Bases: AbstractScanner, ABC

Implements bzip2 scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts data from bzip2 archive and launches data_scan

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/Bzip2
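A signature check of this kind can be sketched as follows, assuming only the stream header is inspected: the bytes "BZh" followed by a block-size digit from 1 to 9. The name looks_like_bzip2 is hypothetical, and the real matcher may test more bytes.

```python
import bz2


def looks_like_bzip2(data: bytes) -> bool:
    """Check the bzip2 stream header: 'BZh' followed by a digit '1'..'9'."""
    return len(data) >= 4 and data[:3] == b"BZh" and 0x31 <= data[3] <= 0x39


sample = bz2.compress(b"password = dummy")
print(looks_like_bzip2(sample), looks_like_bzip2(b"plain text"))
```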

credsweeper.deep_scanner.crx_scanner module

class credsweeper.deep_scanner.crx_scanner.CrxScanner[source]

Bases: AbstractScanner, ABC

Implements CRX files scanning with cut-off prefix

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to cut off the header and use the ZIP payload

static match(data: bytes | bytearray) → bool[source]: Returns True if the prefix matches

static zip_extract(data: bytes) → bytes[source]: Extracts zip payload after signature block

credsweeper.deep_scanner.csv_scanner module

class credsweeper.deep_scanner.csv_scanner.CsvScanner[source]

Bases: AbstractScanner, ABC

Implements CSV scanning

CSV_PATTERN = re.compile(b'[^\r\n]{1,8000}[,;\t|\x1f][^\r\n]{1,8000}')

DELIMITERS = ',;\t|\x1f'

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan each row as a structure with the column name as the key

classmethod get_structure(text: str) → List[Dict[str, Any]][source]: Reads text as standard CSV with a guessed dialect
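Guessing a dialect and reading rows as dictionaries can be done with the standard csv module. The sketch below is an assumption about how such a reader might look, restricted to the DELIMITERS listed above; csv_rows is a hypothetical name.

```python
import csv
import io
from typing import Any, Dict, List

DELIMITERS = ",;\t|\x1f"  # same candidate set as the DELIMITERS attribute above


def csv_rows(text: str) -> List[Dict[str, Any]]:
    """Guess the dialect, then map every row to {column name: cell}."""
    dialect = csv.Sniffer().sniff(text, delimiters=DELIMITERS)
    return list(csv.DictReader(io.StringIO(text), dialect=dialect))


text = "user;password\nalice;dummy1\nbob;dummy2\n"
rows = csv_rows(text)
print(rows)
```

Each row becomes a structure whose keys are the column names, which is the shape a key/value scanner needs. csv.Sniffer raises csv.Error when it cannot determine a delimiter, so a caller would need to handle that case.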

static match(data: bytes | bytearray) → bool[source]: Check if data MAY be in CSV format

sniffer = <csv.Sniffer object>

credsweeper.deep_scanner.deb_scanner module

class credsweeper.deep_scanner.deb_scanner.DebScanner[source]

Bases: AbstractScanner, ABC

Implements deb (ar) scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts data file from .ar (debian) archive and launches data_scan

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/Deb_(file_format)

static walk_deb(data: bytes) → Generator[Tuple[int, str, bytes], None, None][source]: Processes sequence of DEB archive and yields offset, name and data
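A .deb file is an ar archive: an 8-byte global magic followed by members, each with a 60-byte ASCII header. The following sketch walks that layout and yields offset, name and data. It is an illustration under the common ar format, not the library's walk_deb; walk_ar and make_member are hypothetical names.

```python
from typing import Generator, Tuple

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60


def walk_ar(data: bytes) -> Generator[Tuple[int, str, bytes], None, None]:
    """Yield (data offset, member name, member bytes) for each ar member."""
    if not data.startswith(AR_MAGIC):
        return
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        if header[58:60] != b"`\n":
            break  # malformed header terminator
        name = header[0:16].decode("ascii", errors="replace").rstrip(" /")
        try:
            size = int(header[48:58].decode("ascii").strip())
        except ValueError:
            break  # size field is not a decimal number
        start = pos + HEADER_SIZE
        yield start, name, data[start:start + size]
        # Members are aligned to even offsets with one padding byte.
        pos = start + size + (size & 1)


def make_member(name: str, payload: bytes) -> bytes:
    """Build one ar member (test helper): name, mtime, uid, gid, mode, size."""
    header = f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(payload):<10}"
    padding = b"\n" if len(payload) % 2 else b""
    return header.encode("ascii") + b"`\n" + payload + padding


archive = AR_MAGIC + make_member("debian-binary", b"2.0\n") + make_member("note", b"abc")
members = list(walk_ar(archive))
print([(offset, name) for offset, name, _ in members])
```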

credsweeper.deep_scanner.deep_scanner module

class credsweeper.deep_scanner.deep_scanner.DeepScanner(config: Config, scanner: Scanner)[source]

Bases: ByteScanner, Bzip2Scanner, CrxScanner, CsvScanner, DocxScanner, EncoderScanner, GzipScanner, HtmlScanner, JclassScanner, JksScanner, LangScanner, LzmaScanner, MxfileScanner, EmlScanner, OdsScanner, PatchScanner, PdfScanner, PkcsScanner, PngScanner, PptxScanner, ProtobufScanner, RtfScanner, RpmScanner, Sqlite3Scanner, StringsScanner, TarScanner, DebScanner, XmlScanner, XlsScanner, XlsxScanner, ZipScanner, ZlibScanner

Advanced scanner with recursive exploring of data

MEDIA_PATTERNS: Dict[int, List[Tuple[bytes, Pattern]]] = {0: [(b'\x00\x00\x00\x0cjP \r\n\x87\n', None), (b'\x00\x00\x01\x00', None), (b'\x00\x01\x00\x00\x00', None), (b'\x00\x00\x00', re.compile(b'\x00\x00\x00.ftyp3g')), (b'\x00GITCRYPT\x00', None)], 26: [(b'\x1aE\xdf\xa3', None)], 56: [(b'8BPS\x00\x01\x00\x00\x00\x00\x00\x00', None), (b'8BPS\x00\x02\x00\x00\x00\x00\x00\x00', None)], 66: [(b'BM', re.compile(b'BM.{2}\x00{4}'))], 71: [(b'GIF8', re.compile(b'GIF8[79]a[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 73: [(b'II', re.compile(b'II[+*]\x00[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]')), (b'ID3\x03\x00\x00\x00', None)], 77: [(b'MM', re.compile(b'MM\x00[+*][^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 79: [(b'OggS', re.compile(b'OggS[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]')), (b'OTTO\x00', re.compile(b'OTTO\x00[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 82: [(b'RIF', re.compile(b'RIF[FX].{4}[ 0-9A-Za-z]{4}[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 88: [(b'XFIR', re.compile(b'XFIR.{4}[ 0-9A-Za-z]{4}[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 102: [(b'ftyp', re.compile(b'ftyp(isom|MSNV)[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 103: [(b'gimp xcf', re.compile(b'gimp xcf (file|v001|v002)\x00[^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 119: [(b'wOF', re.compile(b'wOF[2F][^\x00-\x08\x0c\x0e\x1f\x80-\xff]{0,4096}[\x00-\x08\x0c\x0e\x1f\x80-\xff]'))], 127: [(b'\x7fELF', re.compile(b'\x7fELF[\x01\x02][\x01\x02]\x01[\x00-\x12]'))], 137: [(b'\x89PNG\r\n\x1a\n', None)], 255: [(b'\xff', re.compile(b'\xff(\xd8\xff[\xdb\xee\xe1\xe0Q]|[\xfb\xf3\xf2])'))]}

property config: Config: Abstract property to be defined in DeepScanner

static get_deep_scanners(data: bytes, descriptor: Descriptor, depth: int, limit: int) → Tuple[List[Any], List[Any]][source]: Returns the possible scan methods for the data, depending on its content, together with the fallback scanners

static is_media(data: bytes | bytearray) → bool[source]: Returns True if well-known media format found
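MEDIA_PATTERNS is keyed by the first byte of the data, so a lookup needs only one dictionary access before comparing prefixes. The sketch below shows one plausible way to use such a table, with two entries copied from the attribute above. The function is_media_sketch and the choice to anchor the regular expression at the start of the data are assumptions.

```python
import re
from typing import Dict, List, Optional, Tuple

# Two entries copied from MEDIA_PATTERNS; the key is the first byte value.
PATTERNS: Dict[int, List[Tuple[bytes, Optional[re.Pattern]]]] = {
    127: [(b"\x7fELF", re.compile(b"\x7fELF[\x01\x02][\x01\x02]\x01[\x00-\x12]"))],
    137: [(b"\x89PNG\r\n\x1a\n", None)],
}


def is_media_sketch(data: bytes) -> bool:
    """Return True when a known prefix (and its optional pattern) matches."""
    if not data:
        return False
    for prefix, pattern in PATTERNS.get(data[0], []):
        # A None pattern means the literal prefix alone is decisive.
        if data.startswith(prefix) and (pattern is None or pattern.match(data)):
            return True
    return False


print(is_media_sketch(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(is_media_sketch(b"\x7fELF\x02\x01\x01\x00"))
print(is_media_sketch(b"hello"))
```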

property scanner: Scanner: Abstract property to be defined in DeepScanner

credsweeper.deep_scanner.docx_scanner module

class credsweeper.deep_scanner.docx_scanner.DocxScanner[source]

Bases: AbstractScanner, ABC

Implements docx scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan DOCX text with splitting by lines

static match(data: bytes | bytearray) → bool[source]: Assumes the ZIP prefix and common office files were checked before

credsweeper.deep_scanner.eml_scanner module

class credsweeper.deep_scanner.eml_scanner.EmlScanner[source]

Bases: AbstractScanner, ABC

Implements eml scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan EML with text representation

static match(data: bytes | bytearray) → bool[source]: According to https://datatracker.ietf.org/doc/html/rfc822, looks up the fields Date, From, To or Subject

credsweeper.deep_scanner.encoder_scanner module

class credsweeper.deep_scanner.encoder_scanner.EncoderScanner[source]

Bases: AbstractScanner, ABC

Implements recursive iteration when data might be base64-encoded

BASE64_PATTERN = re.compile(b'(\\xFF\\xFE|\\xFE\\xFF)?((?:(?P<a>[A-Z])|(?P<b>[a-z])|(?P<c>[0-9/+])|[\\s\\x00\\\\])+(?(a)(?(b)(?(c)(=+|$)|(?!x)x)|(?!x)x)|(?!x)x)|(?:(?P<e>[A-Z])|(?P<f>[a-z])|(?P<g>[0-9_-])|[\\s\\x00\\\\])+(?(e)(?)

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to decode base64-encoded data to bytes and scan the bytes again

static decode(text: str) → bytes | None[source]: Decodes base64 text after removing whitespace. Returns None when the decoding fails
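The contract of decode (strip whitespace, return None on failure) can be sketched with the standard base64 module. This version assumes the standard alphabet and strict validation; the real method may also accept the URL-safe alphabet hinted at by BASE64_PATTERN. The name decode_base64 is hypothetical.

```python
import base64
import binascii
from typing import Optional


def decode_base64(text: str) -> Optional[bytes]:
    """Remove all whitespace, then decode; None signals invalid input."""
    cleaned = "".join(text.split())
    try:
        return base64.b64decode(cleaned, validate=True)
    except (binascii.Error, ValueError):
        return None


print(decode_base64("cGFz c3dv\ncmQ="))
print(decode_base64("not base64!"))
```

Returning None instead of raising lets the caller fall through to the next scanner without exception handling.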

static match(data: bytes | bytearray) → bool[source]: Checks whether data MAY be base64-encoded, allowing whitespace (and escaping)

credsweeper.deep_scanner.gzip_scanner module

class credsweeper.deep_scanner.gzip_scanner.GzipScanner[source]

Bases: AbstractScanner, ABC

Implements gzip scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts data from gzip archive and launches data_scan

static match(data: bytes | bytearray) → bool[source]: According to https://www.rfc-editor.org/rfc/rfc1952

credsweeper.deep_scanner.html_scanner module

class credsweeper.deep_scanner.html_scanner.HtmlScanner[source]

Bases: AbstractScanner, ABC

Implements html scanning if possible

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to represent data as html text and scan as text lines

static match(data: bytes | bytearray) → bool[source]: Detects the HTML format. Assumes a prior is_xml() check returned True.

credsweeper.deep_scanner.jclass_scanner module

class credsweeper.deep_scanner.jclass_scanner.JclassScanner[source]

Bases: AbstractScanner, ABC

Implements java .class scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts data from binary

static get_utf8_constants(stream: BytesIO) → List[str][source]: Extracts only Utf8 constants from java ClassFile

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures - java class

static u2(stream: BytesIO) → int[source]: Extracts unsigned 16 bit big-endian
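Reading an unsigned 16-bit big-endian value maps directly onto struct format ">H". The sketch below applies it to the first eight bytes of a class file (4-byte magic, then minor and major version); read_u2 is a hypothetical stand-in for u2.

```python
import io
import struct


def read_u2(stream: io.BytesIO) -> int:
    """Read two bytes and interpret them as an unsigned big-endian integer."""
    raw = stream.read(2)
    if len(raw) != 2:
        raise ValueError("unexpected end of stream")
    return struct.unpack(">H", raw)[0]


# Class file header: magic 0xCAFEBABE, minor version 0, major version 52.
stream = io.BytesIO(b"\xca\xfe\xba\xbe\x00\x00\x00\x34")
magic_hi, magic_lo = read_u2(stream), read_u2(stream)
minor, major = read_u2(stream), read_u2(stream)
print(hex(magic_hi), hex(magic_lo), minor, major)
```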

credsweeper.deep_scanner.jks_scanner module

class credsweeper.deep_scanner.jks_scanner.JksScanner[source]

Bases: AbstractScanner, ABC

Implements jks scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to open the JKS with a standard password and scan it

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures - jks

credsweeper.deep_scanner.lang_scanner module

class credsweeper.deep_scanner.lang_scanner.LangScanner[source]

Bases: AbstractScanner, ABC

Implements scanning of data if it is a script of some markup language

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to represent data as markup language and scan as structure

static match(data: bytes | bytearray) → bool[source]: Applied in represent_as_structure

credsweeper.deep_scanner.lzma_scanner module

class credsweeper.deep_scanner.lzma_scanner.LzmaScanner[source]

Bases: AbstractScanner, ABC

Implements lzma scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts data from lzma archive and launches data_scan

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures - lzma, also xz

credsweeper.deep_scanner.mxfile_scanner module

class credsweeper.deep_scanner.mxfile_scanner.MxfileScanner[source]

Bases: AbstractScanner, ABC

Scanner for drawio diagram

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to get text data from the xml format

static match(data: bytes | bytearray) → bool[source]: Detects the mxfile (drawio) format. Assumes a prior is_xml() check returned True.

credsweeper.deep_scanner.ods_scanner module

class credsweeper.deep_scanner.ods_scanner.OdsScanner[source]

Bases: PandasScanner, ABC

Implements ods matching

static match(data: bytes | bytearray) → bool[source]: Assumes the ZIP prefix was checked before

credsweeper.deep_scanner.pandas_scanner module

class credsweeper.deep_scanner.pandas_scanner.PandasScanner[source]

Bases: AbstractScanner, ABC

Implements spreadsheet scanning; base class for OdsScanner, XlsScanner and XlsxScanner

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan the text elements of all sheets

credsweeper.deep_scanner.patch_scanner module

class credsweeper.deep_scanner.patch_scanner.PatchScanner[source]

Bases: AbstractScanner, ABC

Implements .patch scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan the patch with text representation

static match(data: bytes | bytearray) → bool[source]: Match logic in data_scan

credsweeper.deep_scanner.pdf_scanner module

class credsweeper.deep_scanner.pdf_scanner.PdfScanner[source]

Bases: AbstractScanner, ABC

Implements pdf scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan PDF elements recursively and the whole text on page as strings

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures - pdf

credsweeper.deep_scanner.pkcs_scanner module

class credsweeper.deep_scanner.pkcs_scanner.PkcsScanner[source]

Bases: AbstractScanner, ABC

Implements pkcs12 scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to open the PKCS12 with a standard password and scan it

static match(data: bytes | bytearray) → bool[source]: Matched ASN1 structure

credsweeper.deep_scanner.png_scanner module

class credsweeper.deep_scanner.png_scanner.PngScanner[source]

Bases: AbstractScanner, ABC

Implements PNG scanning for text chunks

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan the text chunks of the PNG

static match(data: bytes | bytearray) → bool[source]: Returns True if the prefix matches

static yield_png_chunks(data: bytes) → Generator[Tuple[int, str, bytes], None, None][source]: Processes PNG chunks and yields offset, type and data
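A PNG file is an 8-byte signature followed by chunks, each laid out as a 4-byte big-endian length, a 4-byte type, the payload and a 4-byte CRC. The sketch below walks that layout and yields offset, type and data. It is an illustration of the format rather than the library's code; walk_png_chunks and make_chunk are hypothetical names, and the CRC is skipped without verification.

```python
import struct
import zlib
from typing import Generator, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def walk_png_chunks(data: bytes) -> Generator[Tuple[int, str, bytes], None, None]:
    """Yield (payload offset, chunk type, payload) for each PNG chunk."""
    if not data.startswith(PNG_SIGNATURE):
        return
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):  # 12 = length + type + CRC of an empty chunk
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8].decode("ascii", errors="replace")
        start = pos + 8
        if start + length + 4 > len(data):
            break  # truncated chunk
        yield start, chunk_type, data[start:start + length]
        pos = start + length + 4  # skip the CRC without checking it
        if chunk_type == "IEND":
            break


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Build one chunk with a valid CRC (test helper)."""
    crc = zlib.crc32(chunk_type + payload)
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


png = PNG_SIGNATURE + make_chunk(b"tEXt", b"Comment\x00token=dummy") + make_chunk(b"IEND", b"")
chunks = list(walk_png_chunks(png))
print([(offset, kind) for offset, kind, _ in chunks])
```

Text-bearing chunks such as tEXt store a keyword and a value separated by a NUL byte, which is why they are worth scanning for credentials.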

credsweeper.deep_scanner.pptx_scanner module

class credsweeper.deep_scanner.pptx_scanner.PptxScanner[source]

Bases: AbstractScanner, ABC

Implements pptx scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to scan pptx text elements for all slides

static match(data: bytes | bytearray) → bool[source]: Assumes the ZIP prefix and common office files were checked before

credsweeper.deep_scanner.protobuf_scanner module

class credsweeper.deep_scanner.protobuf_scanner.ProtobufScanner[source]

Bases: AbstractScanner, ABC

Implements protobuf scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts data file from protobuf payload and launches data_scan

static match(data: bytes | bytearray) → bool[source]: Simple structure check for whole data

static match_protobuf(data: bytes | bytearray, offset: int, limit: int) → bool[source]

Process data from start to end as simple protobuf chunk

Returns: True when whole chunk was utilized with protobuf structure

static read_varint(data: bytes | bytearray, offset: int) → Tuple[int, int][source]

Reads varint from offset up to 64 bit values (10 bytes)

Returns: used bytes (-1 when overflow), the value
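A protobuf varint stores 7 payload bits per byte, least significant group first, and sets the high bit on every byte except the last. The sketch below decodes one under those rules and returns the same (used bytes, value) shape. The name read_varint_sketch is hypothetical, and returning -1 for truncated input as well as for overflow is an assumption.

```python
from typing import Tuple


def read_varint_sketch(data: bytes, offset: int) -> Tuple[int, int]:
    """Return (used bytes, value); used is -1 if no terminator within 10 bytes."""
    value = 0
    for used in range(10):  # a 64-bit varint occupies at most 10 bytes
        pos = offset + used
        if pos >= len(data):
            break  # ran out of data before the terminating byte
        byte = data[pos]
        value |= (byte & 0x7F) << (7 * used)
        if not byte & 0x80:  # high bit clear: this was the last byte
            return used + 1, value
    return -1, 0


print(read_varint_sketch(b"\x96\x01", 0))
print(read_varint_sketch(b"\xff" * 11, 0))
```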

static read_wire(data: bytes | bytearray, offset: int) → Tuple[int, int][source]

Reads wire to detect sizes

Returns: size of wire type (with primitives types), size of data (length-delimited)

static walk_protobuf(data: bytes, offset: int, limit: int) → Generator[Tuple[int, bytes], None, None][source]: Processes a protobuf sequence and recursively yields offset and data

credsweeper.deep_scanner.rpm_scanner module

class credsweeper.deep_scanner.rpm_scanner.RpmScanner[source]

Bases: AbstractScanner, ABC

Implements rpm scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts files one by one from the package type and launches recursive scan

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures

credsweeper.deep_scanner.rtf_scanner module

class credsweeper.deep_scanner.rtf_scanner.RtfScanner[source]

Bases: AbstractScanner, ABC

Implements RTF scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Scans data as RTF

static get_lines(text: str) → List[str][source]: Extracts text lines from RTF format

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures - Rich Text Format

credsweeper.deep_scanner.sqlite3_scanner module

class credsweeper.deep_scanner.sqlite3_scanner.Sqlite3Scanner[source]

Bases: AbstractScanner, ABC

Implements SQLite3 database scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts data from the SQLite3 database and launches the scan

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures - SQLite Database

static walk_sqlite(data: bytes) → Generator[Tuple[str, Any], None, None][source]: Yields data from sqlite3 database

credsweeper.deep_scanner.strings_scanner module

class credsweeper.deep_scanner.strings_scanner.StringsScanner[source]

Bases: AbstractScanner, ABC

Implements known binary file scanning with ASCII strings representations

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Scan binary files for ASCII strings

static get_enumerated_lines(data: bytes) → List[Tuple[int, str]][source]: Processes binary data to find ASCII strings. Uses the offset instead of the line number.
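Extracting printable runs together with their byte offsets works much like the Unix strings utility. The sketch below assumes a minimum run length of 4 characters, which is a guess and not a documented value; enumerated_strings is a hypothetical name.

```python
import re
from typing import List, Tuple

MIN_LEN = 4  # assumed threshold; the library's value is not documented here
ASCII_RUN = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)


def enumerated_strings(data: bytes) -> List[Tuple[int, str]]:
    """Return (byte offset, text) for every printable ASCII run."""
    return [(m.start(), m.group().decode("ascii")) for m in ASCII_RUN.finditer(data)]


blob = b"\x00\x01secret=dummy\x00\xff\xfeab\x00token"
found = enumerated_strings(blob)
print(found)
```

Reporting the offset in place of a line number lets a finding be located in a binary that has no line structure.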

static match(data: bytes | bytearray) → bool[source]: Match logic in data_scan

credsweeper.deep_scanner.tar_scanner module

class credsweeper.deep_scanner.tar_scanner.TarScanner[source]

Bases: AbstractScanner, ABC

Implements tar scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts files one by one from tar archive and launches data_scan

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures

credsweeper.deep_scanner.tmx_scanner module

class credsweeper.deep_scanner.tmx_scanner.TmxScanner[source]

Bases: AbstractScanner, ABC

Implements tmX file scanning for values only. Image tags are skipped.

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to represent data as xml text and scan as text lines

static match(data: bytes | bytearray) → bool[source]: Detects tm7, tm6, etc. (Threat Modeling) formats.

credsweeper.deep_scanner.xls_scanner module

class credsweeper.deep_scanner.xls_scanner.XlsScanner[source]

Bases: PandasScanner, ABC

Implements xls matching

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures

credsweeper.deep_scanner.xlsx_scanner module

class credsweeper.deep_scanner.xlsx_scanner.XlsxScanner[source]

Bases: PandasScanner, ABC

Implements xlsx matching

static match(data: bytes | bytearray) → bool[source]: Assumes the ZIP prefix and common office files were checked before

credsweeper.deep_scanner.xml_scanner module

class credsweeper.deep_scanner.xml_scanner.XmlScanner[source]

Bases: AbstractScanner, ABC

Implements xml scanning

XML_FIRST_BRACKET_PATTERN = re.compile(b'^\\s*<')

XML_OPENING_TAG_PATTERN = re.compile(b'<([0-9A-Za-z_]{1,256})')

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Tries to represent data as xml text and scan as text lines

static match(data: bytes | bytearray) → bool[source]: Used to detect xml format from raw bytes

credsweeper.deep_scanner.zip_scanner module

class credsweeper.deep_scanner.zip_scanner.ZipScanner[source]

Bases: AbstractScanner, ABC

Implements zip scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Extracts files one by one from zip archives and launches data_scan

static get_size(data: bytes | bytearray) → int[source]

Evaluates the size of the extracted archive

Returns: size of data, or -1 on failure
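One way to estimate the extracted size without unpacking anything is to sum the sizes declared in the central directory. The sketch below assumes that approach and the same -1 convention; zip_uncompressed_size is a hypothetical name. Declared sizes come from the archive itself, so they are an estimate and not a guarantee.

```python
import io
import zipfile


def zip_uncompressed_size(data: bytes) -> int:
    """Sum the declared uncompressed sizes, or return -1 for a broken archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return sum(info.file_size for info in archive.infolist())
    except zipfile.BadZipFile:
        return -1


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("a.txt", b"x" * 100)
    archive.writestr("b.txt", b"y" * 23)

total = zip_uncompressed_size(buffer.getvalue())
print(total, zip_uncompressed_size(b"not a zip"))
```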

static match(data: bytes | bytearray) → bool[source]: According to https://en.wikipedia.org/wiki/List_of_file_signatures

credsweeper.deep_scanner.zlib_scanner module

class credsweeper.deep_scanner.zlib_scanner.ZlibScanner[source]

Bases: AbstractScanner, ABC

Implements zlib data inflate and scan

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None[source]: Inflates zlib-compressed data and launches data_scan

static decompress(limit: int, data: bytes) → bytes[source]: Returns decompressed data by chunks with a limit or exception in unusual cases
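Chunked inflation with a size cap can be sketched with zlib.decompressobj, whose max_length argument bounds the output of each call. The chunk size, the ValueError exceptions and the name inflate_with_limit are assumptions; the library's exception types are not shown in this reference.

```python
import zlib

CHUNK = 4096  # assumed number of output bytes requested per step


def inflate_with_limit(limit: int, data: bytes) -> bytes:
    """Inflate `data` step by step and stop once `limit` bytes are exceeded."""
    inflater = zlib.decompressobj()
    output = bytearray()
    pending = data
    while not inflater.eof:
        # max_length caps this step; unread input moves to unconsumed_tail.
        chunk = inflater.decompress(pending, CHUNK)
        output += chunk
        if len(output) > limit:
            raise ValueError(f"decompressed data exceeds {limit} bytes")
        pending = inflater.unconsumed_tail
        if not chunk and not pending:
            raise ValueError("truncated zlib stream")
    return bytes(output)


packed = zlib.compress(b"A" * 100_000)
restored = inflate_with_limit(200_000, packed)

try:
    inflate_with_limit(1_000, packed)
    exceeded = False
except ValueError:
    exceeded = True
print(len(restored), exceeded)
```

With a 1,000-byte limit the loop stops after the first 4,096-byte step, so the remaining payload is never inflated.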

static match(data: bytes | bytearray) → bool[source]: Returns True if data looks like deflated data with zlib
