Seventh Circuit Thesis Brief

A one-page summary of an empirical legal research project on criminal sentencing appeals, structured LLM extraction, and institutional sources of visible appellate disparity.

Corpus

The project collected 6,011 Seventh Circuit records, identified 1,853 criminal opinion downloads, and deduplicated the final corpus to 1,591 unique criminal sentencing decisions across approximately 3.78 million words.

Extraction and validation

  • Judge-identifying dataset covering panel membership, publication status, posture, issue type, offense, outcome, and relief markers.
  • Schema-enforced JSON outputs across five thematic extraction passes.
  • 12 model configurations and 15 schema iterations tested.
  • Manual audit workflow, with agreement of at least 95% across audited fields for the strongest configuration.
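
As an illustration of the audit step, per-field agreement between model output and a hand-coded audit sample could be computed along the following lines. This is a minimal sketch, not the project's actual tooling: the function name, the case identifiers, the field values, and the four-record sample are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def field_agreement(extracted, audited):
    """Return {field: share of audited values that the extraction matched}.

    Both arguments map a case identifier to a dict of field values.
    Only fields present in the audit row are scored.
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    for case_id, audit_row in audited.items():
        model_row = extracted.get(case_id, {})
        for field, truth in audit_row.items():
            totals[field] += 1
            if model_row.get(field) == truth:
                matches[field] += 1
    return {field: matches[field] / totals[field] for field in totals}


# Hypothetical model output for four cases (illustrative values only).
extracted = {
    "case-001": {"publication_status": "published", "outcome": "affirmed"},
    "case-002": {"publication_status": "nonprecedential", "outcome": "affirmed"},
    "case-003": {"publication_status": "published", "outcome": "vacated"},
    "case-004": {"publication_status": "nonprecedential", "outcome": "affirmed"},
}

# Hypothetical hand-coded audit of the same cases; case-003 disagrees on outcome.
audited = {
    "case-001": {"publication_status": "published", "outcome": "affirmed"},
    "case-002": {"publication_status": "nonprecedential", "outcome": "affirmed"},
    "case-003": {"publication_status": "published", "outcome": "affirmed"},
    "case-004": {"publication_status": "nonprecedential", "outcome": "affirmed"},
}

rates = field_agreement(extracted, audited)
for field, rate in sorted(rates.items()):
    print(f"{field}: {rate:.0%}")
```

Reporting agreement per field rather than as a single pooled number keeps a weak field (here, the toy `outcome` field) from being hidden by strong ones.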

Legal finding

Observed disparity narrowed once cases were separated by publication status and case posture, and once routine nonprecedential dispositions were distinguished from doctrine-heavy published decisions. The thesis argues that visible disparity is substantially track-centered rather than simply judge-centered.
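
The logic of a track-centered explanation can be sketched with a small stratified comparison. The example below is a toy illustration under stated assumptions: the records are synthetic, not drawn from the corpus, and the names `relief_rates`, `make_records`, "Judge A", and "Judge B" are hypothetical. It shows how two judges can differ in their pooled relief rates while having identical rates within each track, because their caseloads are split differently between published and nonprecedential dispositions.

```python
from collections import defaultdict


def relief_rates(records, keys):
    """Group records by the given keys and return the relief rate per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [relief count, total]
    for rec in records:
        group = tuple(rec[k] for k in keys)
        counts[group][0] += rec["relief"]
        counts[group][1] += 1
    return {group: relief / total for group, (relief, total) in counts.items()}


def make_records(judge, track, n, n_relief):
    """Build n synthetic records, the first n_relief of which grant relief."""
    return [
        {"judge": judge, "track": track, "relief": 1 if i < n_relief else 0}
        for i in range(n)
    ]


# Synthetic data: same within-track rates, different mix of tracks.
records = (
    make_records("Judge A", "published", 8, 4)
    + make_records("Judge A", "nonprecedential", 2, 0)
    + make_records("Judge B", "published", 2, 1)
    + make_records("Judge B", "nonprecedential", 8, 0)
)

pooled = relief_rates(records, ["judge"])
by_track = relief_rates(records, ["judge", "track"])

print("Pooled:", pooled)
print("By track:", by_track)
```

In this toy data the pooled rates differ (0.4 against 0.1), yet within the published track both judges sit at 0.5 and within the nonprecedential track both sit at 0.0, so the visible gap comes entirely from track composition.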

What this demonstrates

Large-corpus legal research, federal appellate domain knowledge, LLM extraction design, validation discipline, and the ability to keep doctrine, procedure, and institutional structure visible inside technical systems.