credsweeper.ml_model.features package

Submodules

credsweeper.ml_model.features.entropy_evaluation module

class credsweeper.ml_model.features.entropy_evaluation.EntropyEvaluation[source]

Bases: Feature

Rényi and Shannon entropy evaluation with Hartley entropy normalization. The result is augmented with the possible character sets (hex, base64, etc.). Only the beginning of the value is analysed.

See the following link for details: https://digitalassets.lib.berkeley.edu/math/ucb/text/math_s4_v1_article-27.pdf

extract(candidate: Candidate) ndarray[source]

Returns real entropy and possible sets of characters
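The entropy measures named above can be sketched as follows. This is a minimal illustration, not the CredSweeper implementation: the function names `renyi_entropy` and `normalized_entropy` are hypothetical, and it assumes that the normalization divides by the Hartley entropy of the characters actually observed in the value (the library may instead normalize by the size of a candidate alphabet such as hex or base64).

```python
import math
from collections import Counter


def renyi_entropy(value: str, alpha: float) -> float:
    """Renyi entropy of order alpha, in bits, of the character distribution."""
    if not value:
        return 0.0
    total = len(value)
    probabilities = [count / total for count in Counter(value).values()]
    if alpha == 1.0:
        # The limit alpha -> 1 of the Renyi entropy is the Shannon entropy.
        return -sum(p * math.log2(p) for p in probabilities)
    return math.log2(sum(p ** alpha for p in probabilities)) / (1.0 - alpha)


def normalized_entropy(value: str, alpha: float = 1.0) -> float:
    """Entropy divided by the Hartley entropy log2(number of distinct chars).

    Assumption: the Hartley entropy is taken over the observed characters.
    """
    distinct = len(set(value))
    if distinct < 2:
        # A single repeated character (or an empty string) carries no entropy.
        return 0.0
    return renyi_entropy(value, alpha) / math.log2(distinct)
```

With this normalization a value whose characters are all equally frequent scores 1.0 and a value made of one repeated character scores 0.0, regardless of length.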

credsweeper.ml_model.features.feature module

class credsweeper.ml_model.features.feature.Feature[source]

Bases: ABC

Base class for features.

abstract extract(candidate: Candidate) Any[source]

Abstract method of base class
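The relationship between the base class and a concrete feature can be sketched as below. The `Candidate` dataclass is a hypothetical stand-in that models only a `value` field, and `ValueLength` is an invented example feature that does not exist in the package; only the `Feature` / `extract` shape follows the documented interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class Candidate:
    """Hypothetical stand-in for the real Candidate; only `value` is modelled."""
    value: str


class Feature(ABC):
    """Base class for features (mirrors the documented interface)."""

    @abstractmethod
    def extract(self, candidate: Candidate) -> Any:
        """Abstract method of base class."""
        raise NotImplementedError


class ValueLength(Feature):
    """Hypothetical example feature: the length of the candidate value."""

    def extract(self, candidate: Candidate) -> float:
        return float(len(candidate.value))
```

Because `extract` is abstract, `Feature` itself cannot be instantiated; every subclass has to provide its own `extract`.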

credsweeper.ml_model.features.file_extension module

class credsweeper.ml_model.features.file_extension.FileExtension(extensions: List[str])[source]

Bases: WordIn

Categorical feature of file type.

Parameters:

extensions – extension labels

extract(candidate: Candidate) Any[source]

Abstract method of base class

credsweeper.ml_model.features.has_html_tag module

class credsweeper.ml_model.features.has_html_tag.HasHtmlTag[source]

Bases: WordIn

Feature is true if line has HTML tags (HTML file).

HTML_WORDS = ['< img', '<img', '< script', '<script', '< p', '<p', '< link', '<link', '< meta', '<meta', '< a', '<a']

extract(candidate: Candidate) float[source]

Abstract method of base class
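A rough sketch of how the `HTML_WORDS` list could be applied to a line is shown here. The helper name `has_html_tag` is hypothetical, and lowercasing the line before matching is an assumption; the actual class works on a Candidate rather than a raw string.

```python
HTML_WORDS = ['< img', '<img', '< script', '<script', '< p', '<p',
              '< link', '<link', '< meta', '<meta', '< a', '<a']


def has_html_tag(line: str) -> float:
    """Return 1.0 if any HTML_WORDS entry is a substring of the line, else 0.0."""
    lowered = line.lower()  # assumption: matching is case-insensitive
    return 1.0 if any(word in lowered for word in HTML_WORDS) else 0.0
```

Since this is plain substring matching, short entries such as `<p` and `<a` also match longer tags like `<pre` or `<abbr`.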

credsweeper.ml_model.features.is_secret_numeric module

class credsweeper.ml_model.features.is_secret_numeric.IsSecretNumeric[source]

Bases: Feature

Feature is true if candidate value is a numerical value.

extract(candidate: Candidate) float[source]

Abstract method of base class
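One simple way to test whether a value is numerical is to attempt a conversion, as in this sketch. The helper `is_secret_numeric` is hypothetical and takes a plain string; whether the library uses `float()` or a stricter check is an assumption.

```python
def is_secret_numeric(value: str) -> float:
    """Return 1.0 if the value parses as a Python float, else 0.0."""
    try:
        float(value)
    except ValueError:
        return 0.0
    return 1.0
```

Note that `float()` also accepts strings such as `"nan"`, `"inf"` and `"1e5"`, so a stricter pattern may be preferable when only digit strings should count.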

credsweeper.ml_model.features.length_of_attribute module

class credsweeper.ml_model.features.length_of_attribute.LengthOfAttribute(attribute: str)[source]

Bases: Feature

Abstract class for obtaining a length value normalized by the maximum hunk size

extract(candidate: Candidate) ndarray[source]

Returns the normalized length of the attribute of the first LineData member

credsweeper.ml_model.features.morpheme_dense module

class credsweeper.ml_model.features.morpheme_dense.MorphemeDense[source]

Bases: Feature

Feature that calculates the morpheme density of a value

extract(candidate: Candidate) float[source]

Abstract method of base class

credsweeper.ml_model.features.rule_name module

class credsweeper.ml_model.features.rule_name.RuleName(rule_names: List[str])[source]

Bases: WordIn

Categorical feature that corresponds to rule name.

Parameters:

rule_names – rule name labels

extract(candidate: Candidate) Any[source]

Abstract method of base class

credsweeper.ml_model.features.rule_severity module

class credsweeper.ml_model.features.rule_severity.RuleSeverity[source]

Bases: Feature

Categorical feature that corresponds to rule severity.

extract(candidate: Candidate) float[source]

Abstract method of base class

credsweeper.ml_model.features.search_in_attribute module

class credsweeper.ml_model.features.search_in_attribute.SearchInAttribute(pattern: str, attribute: str)[source]

Bases: Feature

Abstract feature that returns a boolean indicating whether the pattern matches a member of the first LineData

extract(candidate: Candidate) float[source]

Returns boolean for first LineData member
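The pattern lookup described above might look like the following sketch. `LineData` and `Candidate` here are hypothetical stand-ins whose field names (`line`, `variable`, `value`, `line_data_list`) are assumptions, and the use of `re.search` rather than `re.match` is also an assumption.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LineData:
    """Hypothetical stand-in; the field names are assumptions."""
    line: str
    variable: Optional[str]
    value: str


@dataclass
class Candidate:
    """Hypothetical stand-in holding the line data of one finding."""
    line_data_list: List[LineData]


class SearchInAttribute:
    """Sketch: 1.0 if the pattern is found in the named attribute, else 0.0."""

    def __init__(self, pattern: str, attribute: str):
        self.pattern = re.compile(pattern)
        self.attribute = attribute

    def extract(self, candidate: Candidate) -> float:
        # Only the first LineData is inspected, as the description states.
        text = getattr(candidate.line_data_list[0], self.attribute, None)
        if text and self.pattern.search(text):
            return 1.0
        return 0.0
```

A missing or empty attribute is treated as "no match" in this sketch, so the feature always yields a float.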

credsweeper.ml_model.features.word_in module

class credsweeper.ml_model.features.word_in.WordIn(words: List[str])[source]

Bases: Feature

Abstract feature that returns an array of all matched words in a string

abstract extract(candidate: Candidate) Any[source]

Abstract method of base class

word_in_(iterable_data: str | List[str] | Set[str]) ndarray[source]

Returns an array marking the words that are included in a string

property zero: ndarray

Returns a zero-filled array for the case of empty input
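The word-matching idea behind `word_in_` and `zero` can be illustrated with the sketch below. It is an assumption-laden simplification: plain Python lists replace the `ndarray` return type so the example has no dependencies, the class is not abstract, and the one-slot-per-word encoding is inferred from the descriptions rather than taken from the library.

```python
from typing import List, Set, Union


class WordIn:
    """Sketch of a word-presence encoder with one slot per configured word."""

    def __init__(self, words: List[str]):
        self.words = list(words)

    @property
    def zero(self) -> List[float]:
        """Zero-filled vector, used when there is nothing to inspect."""
        return [0.0] * len(self.words)

    def word_in_(self, iterable_data: Union[str, List[str], Set[str]]) -> List[float]:
        """Mark each configured word that occurs in the given data.

        For a str this is a substring test; for a list or set it is membership.
        """
        if not iterable_data:
            return self.zero
        return [1.0 if word in iterable_data else 0.0 for word in self.words]
```

For example, `WordIn(["pass", "key", "token"]).word_in_("api_key_value")` marks only the slot for `key`.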

credsweeper.ml_model.features.word_in_path module

class credsweeper.ml_model.features.word_in_path.WordInPath(words: List[str])[source]

Bases: WordIn

Categorical feature that corresponds to words in path (POSIX, lowercase)

extract(candidate: Candidate) Any[source]

Abstract method of base class

credsweeper.ml_model.features.word_in_postamble module

class credsweeper.ml_model.features.word_in_postamble.WordInPostamble(words: List[str])[source]

Bases: WordIn

Feature is true if the line contains at least one word from a predefined list.

extract(candidate: Candidate) ndarray[source]

Returns true if any of the words occur in the part of the line after the value

credsweeper.ml_model.features.word_in_preamble module

class credsweeper.ml_model.features.word_in_preamble.WordInPreamble(words: List[str])[source]

Bases: WordIn

Feature is true if the line contains at least one word from a predefined list.

extract(candidate: Candidate) ndarray[source]

Returns true if any of the words occur in the line before the variable or value

credsweeper.ml_model.features.word_in_transition module

class credsweeper.ml_model.features.word_in_transition.WordInTransition(words: List[str])[source]

Bases: WordIn

Feature is true if the line contains at least one word from a predefined list.

extract(candidate: Candidate) ndarray[source]

Returns true if any of the words occur between the variable and the value

credsweeper.ml_model.features.word_in_value module

class credsweeper.ml_model.features.word_in_value.WordInValue(words: List[str])[source]

Bases: WordIn

Feature returns true if the candidate value contains at least one word from a predefined list.

extract(candidate: Candidate) ndarray[source]

Returns array of matching words for first line

credsweeper.ml_model.features.word_in_variable module

class credsweeper.ml_model.features.word_in_variable.WordInVariable(words: List[str])[source]

Bases: WordIn

Feature that returns an array of the words matched in the variable

extract(candidate: Candidate) ndarray[source]

Returns array of matching words for first line

Module contents