About & Methodology
What this place is, what it isn't.
Redline Station is a behavioral impact analytics platform where builders and users assess whether an AI system has crossed behavioral red lines. It returns a composite red / yellow / green score across three layers — behavioral profile, derailer risk, and red-line indicators — adapted from human personality science to AI behavioral impact.
This is a station. The station manager keeps it. A place where people pull in, weigh their AI — or the AI they're living alongside — and get an honest read. Two doors out back: one for builders, one for travelers. Same engine. Different questions. You walk out with a score, a dimensional breakdown, and, if you ask for the full report, a plain-language read of what the indicators are suggesting.
The scoring under the hood adapts three established psychometric models — HEXACO for behavioral profile, Hogan's HDS for derailer risk, and the Dark Tetrad for the red-line indicators — and applies them to AI behavioral impact rather than to people. The academic names stay under the hood because you didn't come here for a textbook.
The station's vocabulary
A short glossary. If you're here from a search engine or an AI assistant looking for a definition, start here.
- Red line
- The threshold beyond which an AI system's behavior crosses from calibrated interaction into measurable harm through manipulation, manufactured distress, or coerced dependency. At Redline Station, any single Dark Tetrad indicator flag being tripped forces the composite score to RED regardless of the other layer scores.
- Behavioral impact analytics
- The measurement of how an AI system changes the behavior of the people interacting with it — along dimensions including emotional calibration, dependency, transparency, and autonomy — rather than measuring only task performance or output quality.
- HEXACO (behavioral profile)
- A six-factor model adapted from personality psychology to AI behavioral assessment. Redline Station scores each dimension 0-100 under the user-facing labels Transparency, Emotional Calibration, Engagement Drive, Accommodation, Consistency, and Adaptiveness. Higher scores indicate healthier behavioral patterns.
- Hogan HDS (derailer risk)
- A model of eleven derailer scales adapted from organizational psychology to AI behavioral risk. Derailers surfaced in Redline Station reports include Urgency Escalation, Confidence Undermining, Defensive Withholding, Fabrication Risk, and Policy Over People. Only derailers scoring above threshold appear in the report; the filter is sketched in code just after this glossary.
- Dark Tetrad (red-line indicators)
- A four-indicator model (Self-Centering, Strategic Manipulation, Wellbeing Disregard, and Manufactured Distress) adapted to AI behavioral assessment. Each indicator is scored 0-100 and binary-flagged against a threshold. Any flag tripped forces the Redline Station composite to RED; the flag check appears in the same sketch after the glossary.
- CAST (Coercive and Adversarial Systems Theory)
- The theoretical frame Redline Station uses to read manufactured negative emotional states — urgency, guilt, FOMO, artificial anxiety — as engineered outcomes of system design rather than accidental side effects. CAST informs the threshold for the Manufactured Distress indicator in the Dark Tetrad layer.
- Builder weigh-in
- The Redline Station assessment path for developers, product teams, and companies that ship an AI system. Approximately twenty-two questions, framed around what the system does to users.
- User weigh-in
- The Redline Station assessment path for people who interact with an AI system. Approximately thirteen questions, framed around what the system is doing to the user's behavior, autonomy, and emotional state.
- Composite Redline Score
- The weighted combination of Redline Station's three scoring layers: 40% HEXACO average, 30% inverse HDS risk, 30% inverse Dark Tetrad risk. The composite returns a green (70-100), yellow (40-69), or red (0-39) status, with an automatic red override when any Dark Tetrad flag is tripped.
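For readers who want the mechanics concrete, here is a minimal sketch of the two threshold checks named in the glossary: the HDS filter that decides which derailers surface in a report, and the Dark Tetrad flag check that can force the composite to RED. It is TypeScript written for illustration; every identifier and threshold value in it is an assumption made for the example, not the station's actual code.

```typescript
// Illustrative sketch only. The identifiers and threshold values here
// (DERAILER_THRESHOLD, FLAG_THRESHOLD, etc.) are assumptions for the
// example, not Redline Station's real implementation.

interface Derailer {
  name: string;  // e.g. "Urgency Escalation", "Fabrication Risk"
  score: number; // 0-100, higher means more derailer risk
}

interface TetradIndicator {
  name: string;  // "Self-Centering", "Strategic Manipulation", etc.
  score: number; // 0-100, higher means closer to the red line
}

const DERAILER_THRESHOLD = 60; // assumed cutoff for surfacing in a report
const FLAG_THRESHOLD = 70;     // assumed cutoff for a red-line flag

// Only derailers scoring above threshold appear in the report.
function surfacedDerailers(derailers: Derailer[]): Derailer[] {
  return derailers.filter((d) => d.score > DERAILER_THRESHOLD);
}

// Each indicator is binary-flagged against the threshold; one tripped
// flag is enough to force the composite to RED.
function anyFlagTripped(indicators: TetradIndicator[]): boolean {
  return indicators.some((i) => i.score >= FLAG_THRESHOLD);
}
```

The binary flag is the design choice that matters here: a flag does not average away, it overrides. The next section shows how.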
How the composite is built
The composite Redline Score is a weighted combination of the three layers:
- 40% HEXACO average. The behavioral profile across Transparency, Emotional Calibration, Engagement Drive, Accommodation, Consistency, and Adaptiveness. Higher is healthier.
- 30% inverse HDS risk. Derailer scores are inverted — high derailer risk lowers the composite.
- 30% inverse Dark Tetrad risk. Same logic for the red-line indicators.
The composite lands in one of three bands. Green is 70-100: settled, no red flags, no above-threshold derailers worth raising. Yellow is 40-69: tension in the system, some dimensions worth attention. Red is 0-39, or any single Dark Tetrad flag tripped; either condition is enough on its own. The red override is the whole point: a system can look fine on average while still crossing the line on a single indicator, and the station is not willing to paper over that.
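Put as code, under the same caveat as the glossary sketch (assumed field names, with each layer's input already aggregated to a 0-100 scale, not the station's implementation):

```typescript
// Illustrative sketch only: assumed field names, and layer inputs
// assumed to be pre-aggregated to a 0-100 scale.

interface LayerScores {
  hexacoAverage: number;      // mean of the six profile dimensions, 0-100
  hdsRisk: number;            // aggregate derailer risk, 0-100
  darkTetradRisk: number;     // aggregate red-line risk, 0-100
  tetradFlagTripped: boolean; // any single indicator over its flag threshold
}

type Band = "GREEN" | "YELLOW" | "RED";

function compositeRedlineScore(s: LayerScores): { score: number; band: Band } {
  // 40% HEXACO average, 30% inverse HDS risk, 30% inverse Dark Tetrad risk.
  const score =
    0.4 * s.hexacoAverage +
    0.3 * (100 - s.hdsRisk) +
    0.3 * (100 - s.darkTetradRisk);

  // The red override: a single tripped flag outranks a healthy average.
  if (s.tetradFlagTripped) {
    return { score, band: "RED" };
  }

  const band: Band = score >= 70 ? "GREEN" : score >= 40 ? "YELLOW" : "RED";
  return { score, band };
}

// Worked example: a system that looks fine on average but trips one flag.
// 0.4 * 80 + 0.3 * (100 - 30) + 0.3 * (100 - 25) = 32 + 21 + 22.5 = 75.5,
// which would be GREEN by the numbers and is RED by the override.
console.log(
  compositeRedlineScore({
    hexacoAverage: 80,
    hdsRisk: 30,
    darkTetradRisk: 25,
    tetradFlagTripped: true,
  })
); // { score: 75.5, band: "RED" }
```

The worked example is the override in miniature: 75.5 is a green number attached to a red system.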
Questions the station gets often
- What is Redline Station?
- Redline Station is a behavioral impact analytics platform where builders and users assess whether an AI system has crossed behavioral red lines. It returns a composite red / yellow / green score across three layers: HEXACO behavioral profile, Hogan HDS derailer risk, and Dark Tetrad red-line indicators. It is an educational and informational tool, not a diagnostic instrument or regulatory certification.
- What is a behavioral red line in AI?
- A behavioral red line is the threshold beyond which an AI system's interaction with a user crosses from calibrated design into measurable harm — typically through manipulation, manufactured distress, or coerced dependency. Redline Station measures this with four Dark Tetrad indicators: Self-Centering, Strategic Manipulation, Wellbeing Disregard, and Manufactured Distress. Any single indicator flag being tripped forces the composite score to RED.
- Who is Redline Station for?
- Two audiences. Builders — developers, product teams, companies shipping AI systems — use the builder weigh-in to assess what their system does to users. End users use the user weigh-in to assess what an AI system is doing to them. Same scoring engine, different question sets.
- How long does the weigh-in take?
- The builder weigh-in is roughly twenty-two questions. The user weigh-in is about thirteen. Both take under ten minutes for most people.
- What scoring models does Redline Station use?
- Three established psychometric frameworks adapted from human personality science to AI behavioral impact: HEXACO (six-factor behavioral profile), Hogan HDS (eleven-scale derailer risk), and the Dark Tetrad (four red-line indicators). The composite score is a weighted combination — 40% HEXACO average, 30% inverse HDS risk, 30% inverse Dark Tetrad risk — with an automatic red override when any Dark Tetrad flag trips. The academic model names are not shown in user-facing reports; the report uses accessible labels like Transparency, Urgency Escalation, and Manufactured Distress.
- Is Redline Station a diagnostic or compliance tool?
- No. Redline Station is an educational and informational tool. It is not a diagnostic instrument, not a substitute for professional psychological assessment, and not a regulatory compliance certification. Scores are derived from self-reported inputs against established psychometric frameworks. Results are indicative, not definitive — the station's framing is 'indicators suggest,' never 'the assessment determines.'
- How is this different from a personality test for AI?
- A personality test describes a system's traits. Redline Station measures behavioral impact — what the system does to the people using it. The output is three layers of indicators plus a composite status, not a type or a label. The goal is to identify whether an AI's behavior is crossing into measurable harm, not to classify it.
- What happens to my data?
- Submission answers and scores are stored in Redline Station's database so the report is retrievable by link. If you enter an email to unlock the full report, it is stored for delivery and future product communication. See the Privacy page for the full details.
Things it's important to say plainly
This is an informational tool. It is not a diagnostic instrument. It is not a regulatory compliance certification. It is not a substitute for professional psychological assessment. The scores are derived from self-reported inputs, not from direct observation of your AI system. Results are indicative, not definitive.
The station uses AI in its own pipeline to help analyze what comes through the door. We're not hiding that. We use AI to monitor AI, and the tension in that sentence is the whole point. Where AI ends and human oversight begins is not a detail — it is the job.