credsweeper.deep_scanner package

Submodules

credsweeper.deep_scanner.abstract_scanner module

class credsweeper.deep_scanner.abstract_scanner.AbstractScanner

Bases: ABC

Base abstract class for all recursive scanners

abstract property config: Config

Abstract property to be defined in DeepScanner

abstract data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Abstract method to be defined in DeepScanner

deep_scan_with_fallback(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate]

Scans with deep scanners and fallback scanners if possible

Parameters:
  • data_provider – DataContentProvider with raw data

  • depth – maximal level of recursion

  • recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack

Returns: list with candidates

abstract static get_deep_scanners(data: bytes, descriptor: Descriptor, depth: int) → Tuple[List[Any], List[Any]]

Returns the scan methods that may apply to the data, depending on its content, together with the fallback scanners

static key_value_combination(structure: dict) → Generator[Tuple[Any, Any], None, None]

Combines items by key and value from a dictionary for augmentation: {…, "key": "api_key", "value": "XXXXXXX", …} -> ("api_key", "XXXXXXX")
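The pairing described above can be sketched as follows. This is a minimal illustration, not the CredSweeper implementation: the helper name `pair_key_value` is hypothetical, and only the literal "key" and "value" item names are taken from the docstring's example.

```python
from typing import Any, Dict, Generator, Tuple


def pair_key_value(structure: Dict[str, Any]) -> Generator[Tuple[Any, Any], None, None]:
    """Yield (structure["key"], structure["value"]) when both items are present."""
    # Hypothetical helper: the real method may accept more spellings or item names.
    if "key" in structure and "value" in structure:
        yield structure["key"], structure["value"]


pairs = list(pair_key_value({"id": 1, "key": "api_key", "value": "XXXXXXX"}))
```

A dictionary that lacks either item yields nothing, so callers can iterate over the result unconditionally.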

recursive_scan(data_provider: DataContentProvider, depth: int = 0, recursive_limit_size: int = 0) → List[Candidate]

Recursive function to scan files which might be containers like ZIP archives

Parameters:
  • data_provider – DataContentProvider object that may be a container

  • depth – maximal level of recursion

  • recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack
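How a depth limit and a byte budget can work together may be easier to see in a toy model. The sketch below is an assumption-laden illustration, not CredSweeper code: containers are modelled as nested Python lists, payloads as `bytes`, and the function name `collect_leaves` and the way the budget is decremented are hypothetical.

```python
from typing import Any, List


def collect_leaves(node: Any, depth: int, recursive_limit_size: int) -> List[bytes]:
    """Collect payloads from nested lists, honouring a depth limit and a byte budget."""
    found: List[bytes] = []
    if isinstance(node, bytes):
        # A payload is accepted only if it fits into the remaining budget.
        if len(node) <= recursive_limit_size:
            found.append(node)
        return found
    if depth <= 0:
        # The container sits deeper than allowed, so it is not opened.
        return found
    for child in node:
        leaves = collect_leaves(child, depth - 1, recursive_limit_size)
        # Every accepted payload shrinks the budget left for its siblings.
        recursive_limit_size -= sum(len(leaf) for leaf in leaves)
        found.extend(leaves)
    return found


nested = [b"aa", [b"bbb", [b"cccc"]]]
```

With this model, a small `depth` stops the walk before the innermost list is opened, and a small `recursive_limit_size` stops it once the accepted payloads have used up the budget, which is the property that blunts a recursive zip bomb.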

scan(content_provider: ContentProvider, depth: int, recursive_limit_size: int | None = None) → List[Candidate]

Initial scan method to launch recursive scan. Skips ByteScanner to prevent extra scan

Parameters:
  • content_provider – ContentProvider that might contain raw data

  • depth – maximal level of recursion

  • recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack

abstract property scanner: Scanner

Abstract property to be defined in DeepScanner

static structure_processing(structure: Any) → Generator[Tuple[Any, Any], None, None]

Yields (key, value) pairs from the given structure if applicable
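One plausible reading of "if applicable" is shown below. It is only a sketch: the name `iter_pairs` is hypothetical, and treating list positions as keys is an assumption rather than documented behaviour.

```python
from typing import Any, Generator, Tuple


def iter_pairs(structure: Any) -> Generator[Tuple[Any, Any], None, None]:
    """Yield (key, value) pairs from dicts and (index, item) pairs from sequences."""
    if isinstance(structure, dict):
        yield from structure.items()
    elif isinstance(structure, (list, tuple)):
        # Assumption: the position in the sequence stands in for the key.
        yield from enumerate(structure)
    # Scalars such as str, int or None are not applicable and yield nothing.
```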

structure_scan(struct_provider: StructContentProvider, depth: int, recursive_limit_size: int) → List[Candidate]

Recursive function to scan structured data

Parameters:
  • struct_provider – StructContentProvider object that may be a container

  • depth – maximal level of recursion

  • recursive_limit_size – maximal bytes of opened files to prevent recursive zip-bomb attack

credsweeper.deep_scanner.byte_scanner module

class credsweeper.deep_scanner.byte_scanner.ByteScanner

Bases: AbstractScanner, ABC

Implements plain data scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to represent data as plain text with splitting by lines and scan as text lines

credsweeper.deep_scanner.bzip2_scanner module

class credsweeper.deep_scanner.bzip2_scanner.Bzip2Scanner

Bases: AbstractScanner, ABC

Implements bzip2 scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts data from bzip2 archive and launches data_scan

credsweeper.deep_scanner.csv_scanner module

class credsweeper.deep_scanner.csv_scanner.CsvScanner

Bases: AbstractScanner, ABC

Implements CSV scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to scan each row as structure with column name in key

delimiters = ',;\t|\x1f'

classmethod get_structure(text: str) → List[Dict[str, Any]]

Reads text as standard CSV with a guessed dialect

sniffer = <csv.Sniffer object>
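Dialect guessing with the standard library can be sketched as below. This is an illustration under assumptions, not the class's actual code: the function name `read_rows` is hypothetical, and only the delimiter set and the use of `csv.Sniffer` come from the attributes listed above.

```python
import csv
import io
from typing import Any, Dict, List

# The same candidate delimiters as the documented `delimiters` attribute.
DELIMITERS = ",;\t|\x1f"


def read_rows(text: str) -> List[Dict[str, Any]]:
    """Guess the CSV dialect of `text` and return its rows keyed by column name."""
    # sniff() raises csv.Error when no consistent delimiter can be determined.
    dialect = csv.Sniffer().sniff(text, delimiters=DELIMITERS)
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    return [dict(row) for row in reader]


rows = read_rows("name;token\nalice;XXXXXXX\nbob;YYYYYYY\n")
```

Each returned row maps a column name to its cell, which matches the "column name in key" structure that data_scan is described as scanning.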

credsweeper.deep_scanner.deb_scanner module

class credsweeper.deep_scanner.deb_scanner.DebScanner

Bases: AbstractScanner, ABC

Implements deb (ar) scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts data file from .ar (debian) archive and launches data_scan

static walk_deb(data: bytes) → Generator[Tuple[int, str, bytes], None, None]

Walks the sequence of members in a DEB archive and yields the offset, name and data of each
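A .deb file is an ar archive: an 8-byte magic string followed by members, each with a 60-byte header. The walker below is a minimal sketch of that layout, not the library's walk_deb: the name `walk_ar` is hypothetical, and reporting the payload start as the offset is an assumption.

```python
from typing import Generator, Tuple

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # name 16, mtime 12, uid 6, gid 6, mode 8, size 10, terminator 2


def walk_ar(data: bytes) -> Generator[Tuple[int, str, bytes], None, None]:
    """Yield (payload offset, member name, payload) for each ar archive member."""
    if not data.startswith(AR_MAGIC):
        return
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":
            break  # malformed header: stop instead of guessing
        name = header[0:16].decode("ascii", errors="replace").strip()
        try:
            size = int(header[48:58].decode("ascii").strip())
        except ValueError:
            break
        if size < 0:
            break
        start = offset + HEADER_SIZE
        yield start, name, data[start:start + size]
        # Member data is padded to an even length.
        offset = start + size + (size & 1)
```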

credsweeper.deep_scanner.deep_scanner module

class credsweeper.deep_scanner.deep_scanner.DeepScanner(config: Config, scanner: Scanner)

Bases: ByteScanner, Bzip2Scanner, DocxScanner, CsvScanner, EncoderScanner, GzipScanner, HtmlScanner, JclassScanner, JksScanner, LangScanner, LzmaScanner, PatchScanner, PdfScanner, PkcsScanner, PptxScanner, RtfScanner, RpmScanner, Sqlite3Scanner, StringsScanner, TarScanner, DebScanner, XmlScanner, XlsxScanner, ZipScanner

Advanced scanner with recursive exploring of data

property config: Config

Abstract property to be defined in DeepScanner

static get_deep_scanners(data: bytes, descriptor: Descriptor, depth: int) → Tuple[List[Any], List[Any]]

Returns the scan methods that may apply to the data, depending on its content, together with the fallback scanners
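Content-based selection of this kind is often done by checking leading magic bytes. The sketch below illustrates the idea only: the function `pick_scanners`, the string labels and the always-present fallback are hypothetical and do not reflect how DeepScanner actually decides.

```python
from typing import List, Tuple

# Well-known file signatures; the labels are placeholders for scanner classes.
MAGIC_TABLE = (
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
)


def pick_scanners(data: bytes) -> Tuple[List[str], List[str]]:
    """Return (deep scanners chosen by signature, fallback scanners)."""
    deep = [label for magic, label in MAGIC_TABLE if data.startswith(magic)]
    # Assumption: a plain-text scan is always kept available as the fallback.
    fallback = ["plain-text"]
    return deep, fallback
```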

property scanner: Scanner

Abstract property to be defined in DeepScanner

credsweeper.deep_scanner.docx_scanner module

class credsweeper.deep_scanner.docx_scanner.DocxScanner

Bases: AbstractScanner, ABC

Implements docx scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to scan DOCX text with splitting by lines

credsweeper.deep_scanner.eml_scanner module

class credsweeper.deep_scanner.eml_scanner.EmlScanner

Bases: AbstractScanner, ABC

Implements eml scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to scan EML with text representation

credsweeper.deep_scanner.encoder_scanner module

class credsweeper.deep_scanner.encoder_scanner.EncoderScanner

Bases: AbstractScanner, ABC

Implements recursive iteration when data might be encoded

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to decode base64-encoded data to bytes and scan the decoded bytes again
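The decode step can be sketched with the standard library as below. The helper name `try_base64` is hypothetical, and strict validation is an assumption; the real scanner may be more lenient about whitespace or padding.

```python
import base64
import binascii
from typing import Optional


def try_base64(data: bytes) -> Optional[bytes]:
    """Return the decoded bytes, or None when `data` is not valid base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return None


decoded = try_base64(b"c2VjcmV0")
```

A None result signals that the data should not be treated as encoded, so the caller can move on to other scanners.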

credsweeper.deep_scanner.gzip_scanner module

class credsweeper.deep_scanner.gzip_scanner.GzipScanner

Bases: AbstractScanner, ABC

Implements gzip scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts data from gzip archive and launches data_scan
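Extraction under a size limit might look like the sketch below. It is not the scanner's code: the name `gunzip_limited` and the policy of returning None for oversized or invalid input are assumptions made for illustration.

```python
import gzip
import io
import zlib
from typing import Optional


def gunzip_limited(data: bytes, limit: int) -> Optional[bytes]:
    """Return the decompressed payload, or None if it is invalid or exceeds `limit`."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as stream:
            # Reading limit + 1 bytes is enough to learn whether the limit is exceeded.
            payload = stream.read(limit + 1)
    except (OSError, EOFError, zlib.error):
        return None  # not gzip data, or a truncated or corrupt stream
    return payload if len(payload) <= limit else None
```

Because at most `limit + 1` bytes are ever requested, a highly compressible stream cannot force the whole expansion into memory.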

credsweeper.deep_scanner.html_scanner module

class credsweeper.deep_scanner.html_scanner.HtmlScanner

Bases: AbstractScanner, ABC

Implements html scanning if possible

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to represent data as html text and scan as text lines

credsweeper.deep_scanner.jclass_scanner module

class credsweeper.deep_scanner.jclass_scanner.JclassScanner

Bases: AbstractScanner, ABC

Implements java .class scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts data from binary

static get_utf8_constants(stream: BytesIO) → List[str]

Extracts only Utf8 constants from java ClassFile

static u2(stream: BytesIO) → int

Extracts an unsigned 16-bit big-endian integer
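Reading such a value from a stream is a two-byte read plus a big-endian unpack. The sketch below uses a hypothetical name, `read_u2`, and raises EOFError on a short read, which is an assumption about error handling rather than documented behaviour.

```python
import struct
from io import BytesIO


def read_u2(stream: BytesIO) -> int:
    """Read an unsigned 16-bit big-endian integer from the stream."""
    raw = stream.read(2)
    if len(raw) != 2:
        raise EOFError("expected 2 bytes")
    # ">H" means big-endian unsigned short.
    return struct.unpack(">H", raw)[0]


sample = BytesIO(b"\xca\xfe\x00\x34")
first = read_u2(sample)
second = read_u2(sample)
```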

credsweeper.deep_scanner.jks_scanner module

class credsweeper.deep_scanner.jks_scanner.JksScanner

Bases: AbstractScanner, ABC

Implements jks scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to open the JKS with a standard password and scan it

credsweeper.deep_scanner.lang_scanner module

class credsweeper.deep_scanner.lang_scanner.LangScanner

Bases: AbstractScanner, ABC

Implements scanning of data if it is a script of some markup language

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to represent data as markup language and scan as structure

credsweeper.deep_scanner.lzma_scanner module

class credsweeper.deep_scanner.lzma_scanner.LzmaScanner

Bases: AbstractScanner, ABC

Implements lzma scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts data from lzma archive and launches data_scan

credsweeper.deep_scanner.mxfile_scanner module

class credsweeper.deep_scanner.mxfile_scanner.MxfileScanner

Bases: AbstractScanner, ABC

Scanner for drawio diagram

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to get text data from the xml format

credsweeper.deep_scanner.patch_scanner module

class credsweeper.deep_scanner.patch_scanner.PatchScanner

Bases: AbstractScanner, ABC

Implements .patch scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to scan the patch data with text representation

credsweeper.deep_scanner.pdf_scanner module

class credsweeper.deep_scanner.pdf_scanner.PdfScanner

Bases: AbstractScanner, ABC

Implements pdf scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to scan PDF elements recursively and the whole text on page as strings

credsweeper.deep_scanner.pkcs_scanner module

class credsweeper.deep_scanner.pkcs_scanner.PkcsScanner

Bases: AbstractScanner, ABC

Implements pkcs12 scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to open the PKCS12 with a standard password and scan it

credsweeper.deep_scanner.pptx_scanner module

class credsweeper.deep_scanner.pptx_scanner.PptxScanner

Bases: AbstractScanner, ABC

Implements pptx scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to scan pptx text elements for all slides

credsweeper.deep_scanner.rpm_scanner module

class credsweeper.deep_scanner.rpm_scanner.RpmScanner

Bases: AbstractScanner, ABC

Implements rpm scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts files one by one from the package type and launches recursive scan

credsweeper.deep_scanner.rtf_scanner module

class credsweeper.deep_scanner.rtf_scanner.RtfScanner

Bases: AbstractScanner, ABC

Implements RTF scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Scans data as RTF

static get_lines(text: str) → List[str]

Extracts text lines from RTF format

credsweeper.deep_scanner.sqlite3_scanner module

class credsweeper.deep_scanner.sqlite3_scanner.Sqlite3Scanner

Bases: AbstractScanner, ABC

Implements SQLite3 database scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts data from the SQLite3 database and scans it

static walk_sqlite(data: bytes) → Generator[Tuple[str, Any], None, None]

Yields data from sqlite3 database
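Walking a database held in memory as bytes can be sketched with the standard `sqlite3` module. The code below is an illustration only: the name `walk_tables`, the temporary-file approach and the (table name, row dict) shape of the yielded pairs are assumptions, since the signature above does not say what the `Any` element contains.

```python
import os
import sqlite3
import tempfile
from typing import Any, Generator, Tuple


def walk_tables(data: bytes) -> Generator[Tuple[str, Any], None, None]:
    """Yield (table name, row as dict) for every row of every table."""
    # sqlite3 opens files, so the raw bytes are written to a temporary file first.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scan.db")
        with open(path, "wb") as handle:
            handle.write(data)
        connection = sqlite3.connect(path)
        try:
            tables = connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
            connection.row_factory = sqlite3.Row
            for (table,) in tables:
                # Table names cannot be bound as parameters, so quote the identifier.
                quoted = '"' + table.replace('"', '""') + '"'
                for row in connection.execute(f"SELECT * FROM {quoted}"):
                    yield table, dict(row)
        finally:
            connection.close()
```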

credsweeper.deep_scanner.strings_scanner module

class credsweeper.deep_scanner.strings_scanner.StringsScanner

Bases: AbstractScanner, ABC

Implements known binary file scanning with ASCII strings representations

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts ASCII strings from the binary data and scans them

static get_strings(data: bytes) → List[Tuple[str, int]]

Processes binary data to find ASCII strings. Uses the offset instead of the line number.
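Finding printable runs together with their offsets can be sketched with a bytes regular expression. The name `find_strings` and the minimum length of four characters are assumptions chosen for illustration, not values taken from the library.

```python
import re
from typing import List, Tuple

# Runs of at least four printable ASCII characters (space through tilde).
# The threshold of four is an assumption, similar to the Unix `strings` default.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def find_strings(data: bytes) -> List[Tuple[str, int]]:
    """Return (string, byte offset) for each printable run in `data`."""
    return [(match.group().decode("ascii"), match.start())
            for match in PRINTABLE_RUN.finditer(data)]


found = find_strings(b"\x00\x01token=XXXXXXX\x00ab\x00\xffhello world\x00")
```

The offset takes the place a line number would have in a text file, so a finding can still be located inside the binary.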

credsweeper.deep_scanner.tar_scanner module

class credsweeper.deep_scanner.tar_scanner.TarScanner

Bases: AbstractScanner, ABC

Implements tar scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts files one by one from tar archive and launches data_scan

credsweeper.deep_scanner.tmx_scanner module

class credsweeper.deep_scanner.tmx_scanner.TmxScanner

Bases: AbstractScanner, ABC

Implements TMX file scanning for values only. Image tags are skipped.

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to represent data as xml text and scan as text lines

credsweeper.deep_scanner.xlsx_scanner module

class credsweeper.deep_scanner.xlsx_scanner.XlsxScanner

Bases: AbstractScanner, ABC

Implements xlsx scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to scan xlsx text elements for all sheets

credsweeper.deep_scanner.xml_scanner module

class credsweeper.deep_scanner.xml_scanner.XmlScanner

Bases: AbstractScanner, ABC

Implements xml scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Tries to represent data as xml text and scan as text lines

credsweeper.deep_scanner.zip_scanner module

class credsweeper.deep_scanner.zip_scanner.ZipScanner

Bases: AbstractScanner, ABC

Implements zip scanning

data_scan(data_provider: DataContentProvider, depth: int, recursive_limit_size: int) → List[Candidate] | None

Extracts files one by one from zip archives and launches data_scan
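Extracting members one by one under a byte budget can be sketched with the standard `zipfile` module. The name `walk_zip` and the policy of skipping any member whose declared size exceeds the remaining budget are assumptions, not the scanner's documented behaviour.

```python
import io
import zipfile
from typing import Generator, Tuple


def walk_zip(data: bytes, limit: int) -> Generator[Tuple[str, bytes], None, None]:
    """Yield (file name, content) for members that fit into the byte budget."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if info.file_size > limit:
                continue  # the declared uncompressed size is over the remaining budget
            limit -= info.file_size
            yield info.filename, archive.read(info)
```

Each yielded payload could then be handed back to a recursive scan with the reduced budget, so nested archives draw from the same allowance.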

Module contents