FrequencyDetector API¶
The FrequencyDetector class detects embedding-space shortcut signatures by identifying classes whose signal concentrates in a small set of dimensions.
Class Reference¶
FrequencyDetector¶
FrequencyDetector(
    *,
    top_percent: float = 0.05,
    tpr_threshold: float = 0.5,
    fpr_threshold: float = 0.15,
    probe_estimator: BaseEstimator | None = None,
    probe_evaluation: str = "train",
    probe_holdout_frac: float = 0.2,
    random_state: int = 42
)
Bases: DetectorBase
Embedding-space detector for concentrated class-separable dimensions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `top_percent` | `float` | Fraction of top dimensions used to summarize shortcut signature. | `0.05` |
| `tpr_threshold` | `float` | Per-class true-positive-rate threshold for shortcut flagging. | `0.5` |
| `fpr_threshold` | `float` | Per-class false-positive-rate threshold for shortcut flagging. | `0.15` |
| `probe_estimator` | `BaseEstimator \| None` | Optional sklearn classifier. Must expose | `None` |
| `probe_evaluation` | `str` | "train" or "holdout". | `'train'` |
| `probe_holdout_frac` | `float` | Holdout fraction if | `0.2` |
| `random_state` | `int` | Seed used for holdout split. | `42` |
Source code in shortcut_detect/frequency/detector.py
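To make the two thresholds concrete, here is a minimal sketch of one-vs-rest TPR/FPR flagging. It is not the library's implementation: the helper name `flag_classes` is hypothetical, and the rule "flag a class when its TPR is at least `tpr_threshold` and its FPR is at most `fpr_threshold`" is an assumption made for illustration.

```python
import numpy as np


def flag_classes(y_true, y_pred, tpr_threshold=0.5, fpr_threshold=0.15):
    """Hypothetical helper: per-class one-vs-rest TPR/FPR with a flag.

    Assumption: a class is flagged when TPR >= tpr_threshold and
    FPR <= fpr_threshold. The library's actual rule may differ.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    result = {}
    for cls in np.unique(y_true):
        positives = y_true == cls
        negatives = ~positives
        predicted = y_pred == cls
        tpr = float((predicted & positives).sum() / positives.sum())
        # Guard against a degenerate input with no negatives.
        fpr = float((predicted & negatives).sum() / max(negatives.sum(), 1))
        result[cls.item()] = {
            "tpr": tpr,
            "fpr": fpr,
            "flagged": bool(tpr >= tpr_threshold and fpr <= fpr_threshold),
        }
    return result


# Toy predictions: class 1 is recovered half the time with no false alarms.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 0, 1, 1, 0, 0])
flags = flag_classes(y_true, y_pred)
```

Under this assumed rule, lowering `fpr_threshold` makes flagging stricter, while lowering `tpr_threshold` makes it more permissive.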
Quick Reference¶
Constructor¶
FrequencyDetector(
    *,
    top_percent: float = 0.05,
    tpr_threshold: float = 0.5,
    fpr_threshold: float = 0.15,
    probe_estimator: Optional[BaseEstimator] = None,
    probe_evaluation: str = "train",
    probe_holdout_frac: float = 0.2,
    random_state: int = 42,
)
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `top_percent` | float | 0.05 | Fraction of top dimensions to examine |
| `tpr_threshold` | float | 0.5 | Per-class TPR threshold for flagging |
| `fpr_threshold` | float | 0.15 | Per-class FPR threshold for flagging |
| `probe_estimator` | BaseEstimator | None | sklearn classifier (default: LogisticRegression) |
| `probe_evaluation` | str | "train" | "train" or "holdout" |
| `probe_holdout_frac` | float | 0.2 | Holdout fraction for evaluation |
| `random_state` | int | 42 | Random seed |
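As a rough illustration of what `top_percent` controls, the sketch below picks the highest-scoring fraction of dimensions from a vector of per-dimension scores. The helper `top_dimensions` is hypothetical, and both the ranking by absolute value and the rounding rule (nearest integer, at least one dimension) are assumptions rather than documented behavior.

```python
import numpy as np


def top_dimensions(scores, top_percent=0.05):
    """Hypothetical helper: indices of the highest-scoring dimensions.

    Assumptions: dimensions are ranked by absolute score, and the count
    is rounded to the nearest integer with a minimum of one.
    """
    magnitudes = np.abs(np.asarray(scores, dtype=float))
    k = max(1, int(round(top_percent * magnitudes.size)))
    # argsort is ascending, so reverse it and keep the first k indices.
    return np.argsort(magnitudes)[::-1][:k]


# 40 dimensions, only two of which carry any signal.
scores = np.zeros(40)
scores[[3, 17]] = [2.5, -4.0]
top = top_dimensions(scores, top_percent=0.05)  # 5% of 40 -> 2 dimensions
```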
Methods¶
fit()¶
Fit the frequency detector on embeddings and labels.
Parameters:
| Parameter | Type | Description |
|---|---|---|
| `embeddings` | ndarray | Shape (n_samples, n_features), 2D array |
| `labels` | ndarray | Shape (n_samples,), 1D class labels |
Returns: self
Raises:
- `ValueError` if embeddings is not 2D or labels is not 1D
- `ValueError` if fewer than 10 samples or fewer than 2 unique classes
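The documented error conditions can be checked before calling `fit()`, for example to fail early in a data pipeline. The following is a sketch only: `check_inputs` is a hypothetical helper, not part of the library, and its error messages are made up for illustration.

```python
import numpy as np


def check_inputs(embeddings, labels):
    """Hypothetical pre-check mirroring the documented ValueError conditions."""
    embeddings = np.asarray(embeddings)
    labels = np.asarray(labels)
    if embeddings.ndim != 2:
        raise ValueError("embeddings must be 2D (n_samples, n_features)")
    if labels.ndim != 1:
        raise ValueError("labels must be 1D (n_samples,)")
    if embeddings.shape[0] < 10:
        raise ValueError("at least 10 samples are required")
    if np.unique(labels).size < 2:
        raise ValueError("at least 2 unique classes are required")
    return embeddings, labels


# Passes: 12 samples, 4 features, two classes.
X, y = check_inputs(np.zeros((12, 4)), np.array([0, 1] * 6))
```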
get_report()¶
Get the detection report after fitting.
Returns: Dictionary with method, shortcut_detected, risk_level, metrics, report, notes, and metadata.
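A small sketch of how the top-level report keys might be consumed. `summarize_report` is a hypothetical helper, and the values in `example_report` are placeholders: only the key names come from the description above, while the value types and the `"high"` risk label are assumptions.

```python
def summarize_report(report):
    """Hypothetical helper: one-line summary from the documented top-level keys."""
    detected = report.get("shortcut_detected")
    status = "shortcut detected" if detected else "no shortcut detected"
    method = report.get("method", "unknown")
    risk = report.get("risk_level", "n/a")
    return f"[{method}] {status} (risk: {risk})"


# Placeholder values for illustration only; real reports come from get_report().
example_report = {
    "method": "frequency",
    "shortcut_detected": True,
    "risk_level": "high",
    "metrics": {},
    "report": {},
    "notes": "",
    "metadata": {},
}
summary = summarize_report(example_report)
```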
Attributes (after fit)¶
| Attribute | Type | Description |
|---|---|---|
| `config` | FrequencyConfig | Frozen configuration dataclass |
| `probe_` | BaseEstimator | Fitted probe classifier |
| `_is_fitted` | bool | Whether the detector has been fitted |
Usage Examples¶
Basic Usage¶
from shortcut_detect import FrequencyDetector
detector = FrequencyDetector()
detector.fit(embeddings, labels)
report = detector.get_report()
print(report["shortcut_detected"])
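The usage examples assume that `embeddings` and `labels` already exist. For a quick experiment, toy data with a planted shortcut can be generated with NumPy alone. This is a sketch under stated assumptions: the sizes, the shift of 3.0, and the choice of `shortcut_dims` are arbitrary illustration values, not anything prescribed by the library.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 200, 64

# Binary labels and isotropic Gaussian embeddings.
labels = rng.integers(0, 2, size=n_samples)
embeddings = rng.normal(size=(n_samples, n_features))

# Plant a shortcut: class 1 is shifted along three dimensions only,
# so the class signal concentrates in a small set of dimensions.
shortcut_dims = [5, 20, 41]
embeddings[np.ix_(labels == 1, shortcut_dims)] += 3.0
```

The resulting arrays have the shapes that `fit()` expects, `(n_samples, n_features)` and `(n_samples,)`, and can be passed to `detector.fit(embeddings, labels)`.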
Holdout Evaluation¶
detector = FrequencyDetector(
    probe_evaluation="holdout",
    probe_holdout_frac=0.2,
    random_state=42,
)
detector.fit(embeddings, labels)
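Evaluating a probe on the data it was fitted on can overstate how separable the classes are, which is the usual motivation for a holdout split. The sketch below shows how `probe_holdout_frac` and `random_state` could determine such a split. The helper `holdout_split` is hypothetical, and the plain shuffled split without stratification is an assumption; the library's own splitting logic may differ.

```python
import numpy as np


def holdout_split(n_samples, holdout_frac=0.2, random_state=42):
    """Hypothetical helper: shuffled train/holdout index split.

    Assumption: a plain random shuffle without stratification.
    """
    rng = np.random.default_rng(random_state)
    order = rng.permutation(n_samples)
    n_holdout = max(1, int(round(holdout_frac * n_samples)))
    # The first n_holdout shuffled indices are held out; the rest train.
    return order[n_holdout:], order[:n_holdout]


train_idx, holdout_idx = holdout_split(100, holdout_frac=0.2, random_state=42)
```

Fixing `random_state` makes the split reproducible, so repeated runs score the probe on the same held-out samples.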
Custom Probe Estimator¶
from sklearn.svm import LinearSVC
detector = FrequencyDetector(
    probe_estimator=LinearSVC(max_iter=5000),
    top_percent=0.1,
)
detector.fit(embeddings, labels)
Via Unified ShortcutDetector¶
from shortcut_detect import ShortcutDetector
detector = ShortcutDetector(
    methods=["frequency"],
    freq_top_percent=0.05,
    freq_probe_evaluation="holdout",
)
detector.fit(embeddings, labels)
print(detector.summary())