credsweeper.utils package
Submodules
credsweeper.utils.hop_stat module
- class credsweeper.utils.hop_stat.HopStat[source]
Bases: object
Statistical check of the distances between consecutive symbols of a value on a keyboard layout.
- KEYBOARD = ('`1234567890-=', '\x00qwertyuiop[]\\', "\x00\x00asdfghjkl;'", '\x00\x00zxcvbnm,./')
- TRANSLATION = {33: '1', 34: "'", 35: '3', 36: '4', 37: '5', 38: '7', 40: '9', 41: '0', 42: '8', 43: '=', 58: ';', 60: ',', 62: '.', 63: '/', 64: '2', 65: 'a', 66: 'b', 67: 'c', 68: 'd', 69: 'e', 70: 'f', 71: 'g', 72: 'h', 73: 'i', 74: 'j', 75: 'k', 76: 'l', 77: 'm', 78: 'n', 79: 'o', 80: 'p', 81: 'q', 82: 'r', 83: 's', 84: 't', 85: 'u', 86: 'v', 87: 'w', 88: 'x', 89: 'y', 90: 'z', 94: '6', 95: '-', 123: '[', 124: '\\', 125: ']', 126: '`'}
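A minimal sketch of how a layout table like KEYBOARD could be turned into hop distances, assuming a hop is the larger of the row and column offsets between two keys. The names POSITIONS, hop and mean_hop and the distance metric are illustrative assumptions, not the HopStat implementation. Shifted symbols are handled here only by lower-casing letters, whereas the class provides TRANSLATION for that purpose.

```python
# Same rows as HopStat.KEYBOARD; '\x00' is a placeholder that shifts a row to the right.
KEYBOARD = ('`1234567890-=', '\x00qwertyuiop[]\\', "\x00\x00asdfghjkl;'", '\x00\x00zxcvbnm,./')

# Map every real key to its (row, column) position on the layout.
POSITIONS = {
    key: (row, col)
    for row, keys in enumerate(KEYBOARD)
    for col, key in enumerate(keys)
    if key != '\x00'
}


def hop(first: str, second: str) -> int:
    """Hypothetical metric: the larger of the row and column offsets."""
    (row1, col1), (row2, col2) = POSITIONS[first], POSITIONS[second]
    return max(abs(row1 - row2), abs(col1 - col2))


def mean_hop(value: str) -> float:
    """Average hop between consecutive symbols; unknown symbols are skipped."""
    keys = [ch for ch in value.lower() if ch in POSITIONS]
    hops = [hop(a, b) for a, b in zip(keys, keys[1:])]
    return sum(hops) / len(hops) if hops else 0.0
```

A value typed by sliding along the keyboard, such as "qwerty", gets a low average, while random data tends to jump across the layout.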
credsweeper.utils.pem_key_detector module
- class credsweeper.utils.pem_key_detector.PemKeyDetector(config: Config)[source]
Bases: object
Class to detect PEM PRIVATE keys only
- BASE64_CHARS_SET = {'+', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '=', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'}
- ENTROPY_LIMIT_BASE64 = 4.5
- IGNORE_STARTS = ['-----BEGIN', 'Proc-Type', 'Version', 'DEK-Info']
- MAX_PEM_LENGTH = 32000
- REMOVE_CHARACTERS = ' \t\n\r\x0b\x0c\\\'"`;,[]#*!'
- RE_BASE64_CHARS = re.compile('[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789\\+/=]+')
- RE_PEM_BEGIN = re.compile('(?P<value>-----BEGIN(?![^-]{1,80}ENCRYPTED)[^-]{0,80}PRIVATE[^-]{1,80}KEY[^-]{0,80}-----(.{1,8000}-----END[^-]{1,80}KEY[^-]{0,80}-----)?)')
- RE_PEM_VALUE = re.compile('(?P<value>.{0,32000})')
- WRAP_CHARACTERS = '\\\'"`;,[]#*!'
- detect_pem_key(first_line: LineData, target: AnalysisTarget) List[LineData][source]
Detects a PEM key in a single line and, iteratively, in the following lines, according to https://www.rfc-editor.org/rfc/rfc7468
- Parameters:
first_line – line with the -----BEGIN marker detected by the rule pattern
target – Analysis target
- Returns:
List of LineData with found PEM
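The RE_PEM_BEGIN pattern listed above can be exercised on its own. The sketch below recompiles the same expression to show which headers it accepts. It is not a substitute for detect_pem_key, and the helper name has_private_key_header and the sample strings are made up for illustration.

```python
import re

# Same expression as PemKeyDetector.RE_PEM_BEGIN, split over two lines for readability.
RE_PEM_BEGIN = re.compile(
    r'(?P<value>-----BEGIN(?![^-]{1,80}ENCRYPTED)[^-]{0,80}PRIVATE[^-]{1,80}KEY[^-]{0,80}-----'
    r'(.{1,8000}-----END[^-]{1,80}KEY[^-]{0,80}-----)?)')


def has_private_key_header(line: str) -> bool:
    """True when the line contains a non-encrypted PRIVATE KEY header."""
    return RE_PEM_BEGIN.search(line) is not None
```

The negative lookahead rejects headers that mention ENCRYPTED, and the optional trailing group captures a complete key when header, body and footer share one line.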
- static finalize(line_data_list: List[LineData], key_data_list: List[str], last_line: str) List[LineData][source]
Checks the collected key_data according to the key type
- static is_leading_config_line(line: str) bool[source]
Remove non-key lines from the beginning of a list.
Example lines with non-key leading lines:
Proc-Type: 4,ENCRYPTED
DEK-Info: DEK-Info: AES-256-CBC,2AA219GG746F88F6DDA0D852A0FD3211
ZZAWarrA1...
- Parameters:
line – Line to be checked
- Returns:
True if the line is not a part of encoded data but leading config
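A check like this could be built on the IGNORE_STARTS list. The sketch below is an assumption about the approach, not the library code: it treats a line as leading configuration when it is blank or starts with one of the listed prefixes.

```python
# Same values as PemKeyDetector.IGNORE_STARTS.
IGNORE_STARTS = ['-----BEGIN', 'Proc-Type', 'Version', 'DEK-Info']


def is_leading_config_line_sketch(line: str) -> bool:
    """Hypothetical re-implementation for illustration only."""
    stripped = line.strip()
    # Assumption: blank lines between the header and the key body are skipped too.
    if not stripped:
        return True
    return any(stripped.startswith(prefix) for prefix in IGNORE_STARTS)
```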
- static sanitize_line(line: str, recurse_level: int = 5) str[source]
Remove common symbols that can surround PEM keys inside code.
Examples:
`# ZZAWarrA1`
`* ZZAWarrA1`
` "ZZAWarrA1\n" + `
- Parameters:
line – Line to be cleaned
recurse_level – recursion limit that prevents an infinite loop when a removed symbol occurs inside the base64-encoded data
- Returns:
line with special characters removed from both ends
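The cleaning described above can be sketched as a bounded loop that keeps stripping until the line stops changing. This is a simplified illustration under stated assumptions: the function name sanitize_sketch, the order of the steps and the handling of '+' are hypothetical, and the real method may treat comment markers and escapes differently.

```python
import re
import string

# Same characters as PemKeyDetector.WRAP_CHARACTERS.
WRAP_CHARACTERS = '\\\'"`;,[]#*!'
# Same pattern as Util.PEM_CLEANING_PATTERN: a literal backslash followed by t, n, r, v or f.
ESCAPED_WHITESPACE = re.compile(r'\\[tnrvf]')


def sanitize_sketch(line: str, budget: int = 5) -> str:
    """Strip wrapping symbols from both ends, at most `budget` passes."""
    for _ in range(budget):
        cleaned = line.strip(string.whitespace)
        if '"' in cleaned or "'" in cleaned:
            # Assumption: '+' is string concatenation only when quotes are present.
            cleaned = cleaned.strip('+')
        cleaned = ESCAPED_WHITESPACE.sub('', cleaned)
        cleaned = cleaned.strip(WRAP_CHARACTERS)
        if cleaned == line:
            break  # nothing left to remove
        line = cleaned
    return line
```

The pass limit plays the role of recurse_level: it bounds the work even if some input keeps changing on every pass.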
credsweeper.utils.util module
- class credsweeper.utils.util.Util[source]
Bases: object
Class that contains different useful methods.
- MIN_DATA_ENTROPY: Dict[int, float] = {16: 1.66973671780348, 20: 2.07723544540831, 32: 3.25392803184602, 40: 3.64853567064867, 64: 4.57756933688035, 384: 7.39, 512: 7.55}
- NOT_LATIN1_PRINTABLE_SET = {0, 1, 2, 3, 4, 5, 6, 7, 8, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159}
- PEM_CLEANING_PATTERN = re.compile('\\\\[tnrvf]')
- RANDOM_DATA = b'5\xde\xb6sn\xd2.Uq.\x80\x1e\xa2\xec\x87\x12h\x13\x15p'
- WHITESPACE_TRANS_TABLE = {9: None, 10: None, 11: None, 12: None, 13: None, 32: None}
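For orientation, the two text-cleaning constants above work with standard library calls: a translation table is passed to str.translate, and the compiled pattern is used with sub. The sketch below copies both values; the helper names drop_whitespace and drop_escaped_whitespace are hypothetical, and where the library applies these constants is not stated here.

```python
import re

# Copies of Util.WHITESPACE_TRANS_TABLE and Util.PEM_CLEANING_PATTERN.
WHITESPACE_TRANS_TABLE = {9: None, 10: None, 11: None, 12: None, 13: None, 32: None}
PEM_CLEANING_PATTERN = re.compile(r'\\[tnrvf]')


def drop_whitespace(text: str) -> str:
    """Delete tab, LF, VT, FF, CR and space characters."""
    return text.translate(WHITESPACE_TRANS_TABLE)


def drop_escaped_whitespace(text: str) -> str:
    """Delete escape sequences written out as two characters, e.g. backslash + n."""
    return PEM_CLEANING_PATTERN.sub('', text)
```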
- static check_pk(pkey: DHPrivateKey | Ed25519PrivateKey | Ed448PrivateKey | MLDSA44PrivateKey | MLDSA65PrivateKey | MLDSA87PrivateKey | MLKEM768PrivateKey | MLKEM1024PrivateKey | RSAPrivateKey | DSAPrivateKey | EllipticCurvePrivateKey | X25519PrivateKey | X448PrivateKey) bool[source]
Checks a private key with an encrypt-decrypt round trip on random data
- static decode_base64(text: str, padding_safe: bool = False, urlsafe_detect=False) bytes[source]
Decodes text to bytes, optionally tolerating missing padding (padding_safe) and detecting URL-safe symbols (urlsafe_detect)
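A minimal sketch of what the two flags might do, built on the standard base64 module. The function name and the exact rules are assumptions: here padding_safe re-pads the text to a multiple of four, and urlsafe_detect switches to the '-' and '_' alphabet when either symbol is present.

```python
import base64


def decode_base64_sketch(text: str, padding_safe: bool = False, urlsafe_detect: bool = False) -> bytes:
    """Hypothetical stand-in for illustration; raises binascii.Error on bad input."""
    value = text
    if padding_safe:
        # Drop any existing padding, then add exactly as much as is needed.
        value = value.rstrip('=')
        value += '=' * (-len(value) % 4)
    if urlsafe_detect and ('-' in value or '_' in value):
        return base64.b64decode(value, altchars=b'-_', validate=True)
    return base64.b64decode(value, validate=True)
```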
- static decode_bytes(content: bytes | None, encodings: List[str] | None = None) List[str][source]
Decode content using different encodings.
Tries to decode the bytes with each encoding from the list “encodings” and returns the result of the first one that succeeds without exceptions. UTF-16 requires a BOM.
- Parameters:
content – raw data that might be text
encodings – supported encodings
- Returns:
list of file rows in a suitable encoding from “encodings”. If none of the encodings match, an empty list is returned. An empty list is also returned when, after the last encoding, a 0 symbol is present in the lines anywhere but at the end.
- static decode_text(content: bytes | None, encodings: List[str] | None = None) str | None[source]
Decode content using different encodings.
Tries to decode the bytes with each encoding from the list “encodings” and returns the result of the first one that succeeds without exceptions. UTF-16 requires a BOM.
- Parameters:
content – raw data that might be text
encodings – supported encodings
- Returns:
Decoded text as str for the first suitable encoding, or None when binary data is detected
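The try-each-encoding approach can be sketched with plain bytes.decode calls. The default encoding list and the function name below are assumptions made for the example. The BOM check reflects the note that UTF-16 requires a BOM, since Python's utf_16 codec would otherwise accept almost any even-length input.

```python
import codecs
from typing import Optional, Sequence


def decode_text_sketch(content: Optional[bytes],
                       encodings: Sequence[str] = ("utf_8", "utf_16")) -> Optional[str]:
    """Return text from the first encoding that decodes cleanly, else None."""
    if content is None:
        return None
    for encoding in encodings:
        if encoding == "utf_16" and not content.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
            continue  # without a BOM the byte order would be a guess
        try:
            return content.decode(encoding)
        except UnicodeError:
            continue
    return None
```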
- static extract_element_data(element: Any, attr: str) str[source]
Extract xml element data to string.
Try to extract the xml data and strip() the string.
- Parameters:
element – xml element
attr – attribute name
- Returns:
String xml data with strip()
- static get_asn1_size(data: bytes | bytearray) int[source]
Only sequence type 0x30 and size correctness are checked. Returns the size of ASN1 data over 128 bytes, or 0 if there is no data of interest.
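A sketch of such a size check, assuming DER long-form lengths only: the byte after the 0x30 tag has its high bit set and its low seven bits give the number of length bytes that follow. The function name, the four-byte cap and the decision to ignore short and indefinite forms are assumptions for the example.

```python
def asn1_size_sketch(data: bytes) -> int:
    """Total size (header + payload) of a leading SEQUENCE, or 0 if it does not fit."""
    if len(data) < 4 or data[0] != 0x30:
        return 0
    first = data[1]
    if first <= 0x80:
        return 0  # short form (under 128 bytes) or indefinite form: ignored in this sketch
    length_bytes = first & 0x7F
    if length_bytes > 4 or len(data) < 2 + length_bytes:
        return 0
    payload = int.from_bytes(data[2:2 + length_bytes], "big")
    total = 2 + length_bytes + payload
    # The declared size must fit inside the available data.
    return total if total <= len(data) else 0
```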
- static get_chunks(line_len: int) List[Tuple[int, int]][source]
Returns chunks positions for given line length
- static get_excel_column_name(column_index: int) str[source]
Converts index based column position into Excel style column name
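The conversion is a bijective base-26 encoding. The sketch below assumes a zero-based index (0 gives "A") and an empty string for negative input. Both choices, and the function name, are assumptions rather than documented behaviour.

```python
def excel_column_name_sketch(column_index: int) -> str:
    """Zero-based index to Excel-style name: 0 -> A, 25 -> Z, 26 -> AA."""
    name = ''
    number = column_index + 1  # switch to one-based for the bijective conversion
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        name = chr(ord('A') + remainder) + name
    return name
```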
- static get_extension(file_path: str, lower=True) str[source]
Returns the file extension, in lower case by default, e.g. ‘.txt’ (or ‘.JPG’ when lower is False)
- static get_min_data_entropy(x: int) float[source]
Returns the minimal entropy for random data of the given size. Precalculated values are used for speed.
- static get_shannon_entropy(data: str | bytes) float[source]
Borrowed from http://blog.dkbza.org/2007/05/scanning-data-for-entropy-anomalies.html.
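For reference, Shannon entropy over the symbols of a value is H = -Σ p·log2(p), where p is the relative frequency of each distinct symbol. The sketch below is a generic implementation of that formula, not the library's code, which may count symbols differently.

```python
import math
from collections import Counter
from typing import Union


def shannon_entropy_sketch(data: Union[str, bytes]) -> float:
    """Entropy in bits per symbol; 0.0 for empty input."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # works for both str (characters) and bytes (ints)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A repeated symbol carries no information, while four equally frequent symbols need two bits each.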
- static get_xml_from_lines(xml_lines: List[str]) Tuple[List[str] | None, List[int] | None][source]
Parse xml data from list of string and return List of str.
- Parameters:
xml_lines – list of lines of xml data
- Returns:
List of formatted strings (f"{root.tag} : {root.text}")
- Raises:
xml exception
- static is_ascii_entropy_validate(data: bytes) bool[source]
Tests a small data sequence (<256) for randomness using an ASCII check and Shannon entropy. Returns True when the data consists of ASCII symbols or has small entropy.
- static is_binary(data: bytes | bytearray) bool[source]
Returns True when a sequence of two zero bytes is found at the beginning of the data. Such a sequence never occurs in text formats (UTF-8, UTF-16). UTF-32 is not supported.
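The check can be sketched as a substring search over the head of the data. How many bytes count as the beginning is not stated, so the head_size parameter and its default are assumptions, as is the function name.

```python
def is_binary_sketch(data: bytes, head_size: int = 1024) -> bool:
    """True when two consecutive zero bytes occur within the first head_size bytes."""
    return b"\x00\x00" in data[:head_size]
```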
- static is_latin1(data: bytes | bytearray) bool[source]
Returns True when data looks like LATIN-1 for first MAX_LINE_LENGTH bytes.
- static json_dump(obj: Any, file_path: str | Path, encoding='utf_8', indent=4) None[source]
Write dictionary to JSON file
- static json_load(file_path: str | Path, encoding='utf_8') Any[source]
Load dictionary from JSON file
- static load_pk(data: bytes, password: bytes | None = None) DHPrivateKey | Ed25519PrivateKey | Ed448PrivateKey | MLDSA44PrivateKey | MLDSA65PrivateKey | MLDSA87PrivateKey | MLKEM768PrivateKey | MLKEM1024PrivateKey | RSAPrivateKey | DSAPrivateKey | EllipticCurvePrivateKey | X25519PrivateKey | X448PrivateKey | None[source]
Try to load private key from PKCS1, PKCS8 and PKCS12 formats
- static parse_python(source: str) List[Any][source]
Parses Python source and converts it back, to remove string concatenation and line wrapping
- static read_data(path: str | Path) bytes | None[source]
Read the file bytes as is.
Try to read the data of the file.
- Parameters:
path – path to file
- Returns:
raw bytes of the file, or None if the file could not be read
- static read_file(path: str | Path, encodings: List[str] | None = None) List[str][source]
Read the file content using different encodings.
Tries to read the contents of the file with each encoding from the list “encodings”. As soon as reading succeeds without exceptions, the data is returned in the current encoding.
- Parameters:
path – path to file
encodings – supported encodings
- Returns:
list of file rows in a suitable encoding from “encodings”. If none of the encodings match, an empty list is returned.
- static split_text(text: str) List[str][source]
Splits a text into lines, handling all common line endings (e.g., LF, CRLF, CR).
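One way to split on LF, CRLF and CR only is a regular expression that tries CRLF first. The sketch below is an illustration with a hypothetical name. It differs from str.splitlines, which also breaks on form feed, vertical tab and several Unicode separators, and which of the two behaviours the library follows is not stated here.

```python
import re

# CRLF must come first so it is consumed as one line ending, not two.
LINE_ENDINGS = re.compile(r'\r\n|\r|\n')


def split_text_sketch(text: str) -> list:
    """Split on LF, CRLF and CR; a trailing line ending yields a final empty string."""
    return LINE_ENDINGS.split(text)
```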
- static subtext(text: str, pos: int, hunk_size: int) str[source]
Cuts the text symmetrically around the given position; near either end of the text, the unused quota goes to the other side so that the result fits within 2x hunk_size
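A sketch of the windowing idea: take hunk_size symbols on each side of pos, and when the window would run past either end of the text, shift it so the full 2x hunk_size quota is still used. The function name and the exact boundary handling are assumptions, and the real method may also trim the result.

```python
def subtext_sketch(text: str, pos: int, hunk_size: int) -> str:
    """Return at most 2 * hunk_size symbols of text around pos."""
    window = 2 * hunk_size
    if len(text) <= window:
        return text
    start = max(0, pos - hunk_size)
    end = start + window
    if end > len(text):
        # Too close to the end: give the unused quota to the left side.
        end = len(text)
        start = end - window
    return text[start:end]
```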