BEHAVE/CHANGELOG.md
anti b182e2fe3b feat(text): add meta.* corpus-footprint layer and 4 language-aware primitives (v0.1.3)
Adds 12 new primitives across two waves of spec work this session.

meta.* layer (8 primitives) — corpus-snapshot footprint:
  total_messages, corpus_span_days, msg_per_day, active_days,
  activity_density, first_seen_ts, last_seen_ts, fingerprint_confidence.
  Motivated by two actors with identical message counts (53 each) producing
  indistinguishable profiles despite radically different presence shapes
  (0.3-day burst vs 47-day long tail).

Language-aware characterization primitives (4 primitives):
  stylometric.pos_ngram_signature — SimHash over POS bigram frequency vector;
    syntactic skeleton fingerprint that survives full vocabulary paraphrase.
  lexical.dialect_region — BCP-47 free_string (es-CL, es-AR, es-MX, …);
    designed for EYENET integration with INGEOTEC regional-spanish-models.
  lexical.evaluative_morphology_density — diminutive/augmentative/pejorative
    suffix density; stable per-author trait baked into language acquisition.
  lexical.optional_grammar_signature — SimHash over optional-grammar choice
    points (compound/simple past, subjunctive, leísmo, relative pronoun);
    high-reliability Spain vs LatAm discriminator.

Also fixes stale scratchpad.md references throughout (README.md is now the
authority), bumps behave-text to 0.1.3, and updates CHANGELOG.
2026-05-23 01:54:12 -04:00

# Changelog
All notable changes to BEHAVE packages are documented here.
Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---
## [behave-text 0.1.3] — 2026-05-23
### behave-text
#### Added
- `stylometric.pos_ngram_signature` — 64-bit SimHash over POS n-gram (default bigram)
frequency vector. Captures syntactic skeleton independent of vocabulary. Tagger-dependent;
source label must declare tagger + model + n. Calibration note: noisy on chat-domain text,
weight low until validated.
- `lexical.dialect_region` — BCP-47 language-region free_string (`es-CL`, `es-AR`, `es-MX`,
`es-ES`, `en-US`, etc.) for the actor's dominant regional variety, detected from lexical
marker density. Emit `unknown` below confidence threshold. Designed for EYENET integration
with INGEOTEC `regional-spanish-models` vocabulary tables (MIT).
- `lexical.evaluative_morphology_density` — numeric [0,1] rate of evaluative morpheme tokens
(diminutives, augmentatives, pejoratives, intensives) per total tokens. Stable per-author
trait baked into language acquisition; strong Spain/LatAm regional discriminator.
- `lexical.optional_grammar_signature` — 64-bit SimHash over author preference probabilities
at optional-grammar choice points (for Spanish: compound vs simple past, subjunctive usage,
leísmo/laísmo/loísmo, relative pronoun choice). Choice-point set is extractor-defined and
declared in source label.
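
The two SimHash-based signatures above rest on the same mechanism: hash every feature to 64 bits, let each feature vote on each bit with its weight, and keep the sign of each bit's tally. The following is a minimal sketch for the POS-bigram case, assuming the tags already come from an external tagger. The helper names (`pos_bigram_counts`, `simhash64`, `hamming`) and the choice of BLAKE2b as the per-feature hash are illustrative assumptions, not the behave-text implementation.

```python
import hashlib
from collections import Counter


def pos_bigram_counts(tags):
    """Count adjacent POS-tag pairs (bigrams) in one tag sequence."""
    return Counter(zip(tags, tags[1:]))


def simhash64(weights):
    """64-bit SimHash of a {feature: weight} mapping (hypothetical helper)."""
    tally = [0.0] * 64
    for feature, weight in weights.items():
        # Assumption: BLAKE2b truncated to 8 bytes serves as the feature hash.
        digest = hashlib.blake2b(repr(feature).encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for bit in range(64):
            # A set bit votes +weight for that position, a clear bit votes -weight.
            tally[bit] += weight if (h >> bit) & 1 else -weight
    # Keep only the sign of each tally: positive -> 1, otherwise 0.
    return sum(1 << bit for bit in range(64) if tally[bit] > 0)


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


# Two messages with different vocabulary but the same tag sequence
# ("the dog chased a cat" / "a user posted the link") share one signature,
# because only the tags enter the hash.
tags_a = ["DET", "NOUN", "VERB", "DET", "NOUN"]
tags_b = ["DET", "NOUN", "VERB", "DET", "NOUN"]
sig_a = simhash64(pos_bigram_counts(tags_a))
sig_b = simhash64(pos_bigram_counts(tags_b))
distance = hamming(sig_a, sig_b)  # 0 for identical tag sequences
```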
---
## [behave-text 0.1.2] — 2026-05-23
### behave-text
#### Added
- `meta.*` layer — 8 new corpus-snapshot primitives: `total_messages`, `corpus_span_days`,
`msg_per_day`, `active_days`, `activity_density`, `first_seen_ts`, `last_seen_ts`,
`fingerprint_confidence`. Separates actors with identical message counts but radically
different presence shapes (bursty single-session vs long-tail lurker), which previously
produced indistinguishable profiles.
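
A rough sketch of how such a footprint could be derived from message timestamps is shown below. The formulas are assumptions made for illustration, not the definitions used by behave-text: days are bucketed in UTC, `msg_per_day` floors the span at one day to avoid dividing by zero, and `activity_density` is taken as active days over calendar days covered. `fingerprint_confidence` is left out because its definition is not stated here. The function names `meta_footprint` and `evenly_spaced` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86400.0


def meta_footprint(timestamps):
    """Derive corpus-footprint values from timezone-aware datetimes (assumed formulas)."""
    if not timestamps:
        raise ValueError("need at least one timestamp")
    utc = sorted(ts.astimezone(timezone.utc) for ts in timestamps)
    first, last = utc[0], utc[-1]
    span_days = (last - first).total_seconds() / SECONDS_PER_DAY
    active_days = len({ts.date() for ts in utc})            # distinct UTC dates with traffic
    calendar_days = (last.date() - first.date()).days + 1   # inclusive date range
    return {
        "total_messages": len(utc),
        "corpus_span_days": span_days,
        "msg_per_day": len(utc) / max(span_days, 1.0),      # assumption: one-day floor
        "active_days": active_days,
        "activity_density": active_days / calendar_days,    # assumption: share of days active
        "first_seen_ts": first.isoformat(),
        "last_seen_ts": last.isoformat(),
    }


def evenly_spaced(start, span, n):
    """Synthetic test data: n timestamps spread evenly across `span`."""
    return [start + span * i / (n - 1) for i in range(n)]


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
# Same message count, very different presence shapes.
burst = meta_footprint(evenly_spaced(start, timedelta(days=0.3), 53))
long_tail = meta_footprint(evenly_spaced(start, timedelta(days=47), 53))
```

Both synthetic actors report the same `total_messages`, while `corpus_span_days`, `msg_per_day`, and `active_days` pull them apart.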
#### Fixed
- Stale `scratchpad.md` references in `primitives.py` docstring, `tests/test_primitives.py`
docstring, and `attribution-recipes.md`; `README.md` is now the authority.
---
## [0.1.0] — 2026-05-17
Initial public release of all three packages.
### behave-core
- Shared observation envelope and schema contract (`BehaveObservation`)
- Pydantic v2 base models for domain-agnostic behavioral records
### behave-shell
- Shell-session behavioral observation registry
- Primitive catalog covering command execution, session lifecycle, environment, and navigation events
- Layered on `behave-core`
### behave-text
- Text/messaging-domain behavioral observation registry
- Primitive catalog covering message composition, conversation, and metadata events
- Layered on `behave-core`