AI Engine

AI That Evaluates Work, Not Answers

ProveIQ's evaluation engine scores real work samples against employer rubrics — delivering consistent, explainable, bias-resistant results at scale.

Quiz Platforms vs. ProveIQ AI

The difference isn't just technical — it's philosophical. We evaluate capability, not test-taking skill.

Quiz Platforms

Recall of memorised answers
Binary right/wrong matching
No feedback, or only a final percentage
Question phrasing bias, test-taking tricks
Low — tests knowledge not performance

ProveIQ AI

Applied reasoning on real tasks
Multi-dimensional rubric scoring
Detailed, dimension-level guidance
Blind, criteria-anchored evaluation
High — mirrors actual job tasks

The 5-Step AI Workflow

From rubric ingestion to score aggregation — every step is transparent, auditable, and employer-configurable.

STEP 1

Rubric Ingestion

ProveIQ ingests the employer-defined rubric, extracting scoring dimensions, weights, and quality benchmarks. The model is anchored to these criteria before a single submission is read.
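As a rough illustration of what an ingested rubric might look like, the sketch below models dimensions and weights and checks that the weights add up to 100%. The `Dimension` class and `validate_rubric` function are hypothetical names chosen for this example, not ProveIQ's actual API; the weights are the default ones listed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str
    weight: float  # fraction of the overall score, e.g. 0.30 for 30%


def validate_rubric(dimensions: list[Dimension], tol: float = 1e-9) -> None:
    """Raise ValueError if the rubric cannot be used for scoring."""
    if not dimensions:
        raise ValueError("rubric has no dimensions")
    names = [d.name for d in dimensions]
    if len(set(names)) != len(names):
        raise ValueError("duplicate dimension names")
    if any(d.weight < 0 for d in dimensions):
        raise ValueError("weights must be non-negative")
    total = sum(d.weight for d in dimensions)
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights sum to {total:.4f}, expected 1.0")


# Default weights as published on this page.
DEFAULT_RUBRIC = [
    Dimension("Accuracy", 0.30),
    Dimension("Clarity", 0.20),
    Dimension("Completeness", 0.20),
    Dimension("Creativity", 0.15),
    Dimension("Process", 0.15),
]

validate_rubric(DEFAULT_RUBRIC)  # passes silently when the rubric is valid
```

Rejecting a malformed rubric at this stage means no submission is ever scored against weights that do not add up.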

STEP 2

Submission Analysis

The candidate's work is parsed end-to-end — structure, argumentation, evidence, and methodology — without any reference to the candidate's identity or background.

STEP 3

Multi-Dimensional Scoring

Each dimension (Accuracy, Clarity, Completeness, Creativity, Process) is scored independently on a 0–5 scale, then combined using the rubric weights into a final overall score.
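The combination step can be read as a weighted average. The sketch below assumes exactly that; the function name `overall_score`, the normalisation by total weight, and the example dimension scores are illustrative assumptions rather than a description of the production scorer.

```python
def overall_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores, each on a 0-5 scale."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same dimensions")
    for name, value in scores.items():
        if not 0.0 <= value <= 5.0:
            raise ValueError(f"{name} score {value} is outside the 0-5 scale")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Dividing by the total keeps the result on the 0-5 scale even if the
    # weights are given as percentages (30, 20, ...) instead of fractions.
    return sum(scores[n] * weights[n] for n in scores) / total_weight


# Default weights from this page; the dimension scores are made up.
weights = {"Accuracy": 0.30, "Clarity": 0.20, "Completeness": 0.20,
           "Creativity": 0.15, "Process": 0.15}
scores = {"Accuracy": 4.0, "Clarity": 3.5, "Completeness": 4.5,
          "Creativity": 3.0, "Process": 4.0}

print(round(overall_score(scores, weights), 2))  # prints 3.85
```

Here the arithmetic is 4.0 × 0.30 + 3.5 × 0.20 + 4.5 × 0.20 + 3.0 × 0.15 + 4.0 × 0.15 = 3.85.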

STEP 4

Feedback Generation

For every dimension, the AI writes specific, evidence-backed feedback citing passages from the submission. Candidates receive actionable guidance, not generic commentary.

STEP 5

Score Aggregation

Final scores are stored, auditable, and exportable. Employers can view cohort distributions, flag outliers, and apply overrides — all with a full audit trail.

You're hired based on proof, not promises

Score Dimensions & Default Weights

Our default rubric balances five dimensions. Employers can customise every weight to fit their role requirements.

Accuracy: 30%

Clarity: 20%

Completeness: 20%

Creativity: 15%

Process: 15%

Score Interpretation Scale

All scores are on a 0–5 scale. Here's what each band means in practice.

Exceptional: 4.5 – 5.0

Strong: 3.5 – 4.4

Adequate: 2.5 – 3.4

Developing: 1.5 – 2.4

Beginning: 0 – 1.4
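The bands above translate into a simple lookup. This is a minimal sketch: the helper name `band_for` is hypothetical, and rounding the score to one decimal place before the lookup is an assumption made because the bands are published to one decimal.

```python
# (lower bound, label), highest band first; bounds taken from the scale above.
BANDS = [
    (4.5, "Exceptional"),
    (3.5, "Strong"),
    (2.5, "Adequate"),
    (1.5, "Developing"),
    (0.0, "Beginning"),
]


def band_for(score: float) -> str:
    """Return the interpretation band for a score on the 0-5 scale."""
    if not 0.0 <= score <= 5.0:
        raise ValueError(f"score {score} is outside the 0-5 scale")
    rounded = round(score, 1)  # assumption: bands apply to the displayed value
    for lower, label in BANDS:
        if rounded >= lower:
            return label
    raise AssertionError("unreachable: the lowest band starts at 0.0")


print(band_for(4.4))  # prints Strong
print(band_for(4.5))  # prints Exceptional
```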

Sample Evaluation Result

This is what a candidate and employer both see after an assessment is processed — transparent, detailed, and actionable.

ProveIQ Score

Jordan Kim

Verified

0.0 / 5.0

Accuracy: 88%

Clarity: 75%

Completeness: 92%

Creativity: 68%

Process: 82%

AI-evaluated · Real-world tasks

Employer Override Controls

AI does the heavy lifting — but humans stay in control. Every output is adjustable before it reaches candidates.

Adjust Weights

Shift dimension weights to align with your specific role requirements. Emphasise Process for junior hires or Creativity for innovation roles — the AI re-scores accordingly.

Edit Feedback

Reviewers can annotate or rewrite AI-generated feedback before it reaches candidates, ensuring tone and content match your employer brand.

Override Scores

Every AI score can be reviewed and overridden by a qualified human reviewer. Overrides are logged to the audit trail with a mandatory reason field.
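One way to picture an override with a mandatory reason is sketched below. The `ScoreRecord` structure, its field names, and the `apply_override` function are hypothetical and exist only for this example; the point is that the original AI score is preserved and every change leaves a log entry.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ScoreRecord:
    submission_id: str
    ai_score: float      # never modified after evaluation
    final_score: float   # what the employer and candidate see
    audit_trail: list[dict] = field(default_factory=list)


def apply_override(record: ScoreRecord, reviewer: str,
                   new_score: float, reason: str) -> None:
    """Replace the final score and log who changed it, when, and why."""
    if not reason.strip():
        raise ValueError("an override reason is required")
    if not 0.0 <= new_score <= 5.0:
        raise ValueError(f"score {new_score} is outside the 0-5 scale")
    record.audit_trail.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "reviewer": reviewer,
        "previous_score": record.final_score,
        "new_score": new_score,
        "reason": reason.strip(),
    })
    record.final_score = new_score


record = ScoreRecord("sub-001", ai_score=3.4, final_score=3.4)
apply_override(record, "reviewer-17", 3.6, "Rubric credits the appendix analysis")
```

After the call, `record.ai_score` is still 3.4, `record.final_score` is 3.6, and the audit trail holds one entry.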

Built for Fairness

Six structural commitments that ensure every candidate is evaluated on merit alone.

Rubric-Anchored

Every score is derived directly from employer-defined criteria — not from the AI's priors or generalised quality heuristics.

Blind Evaluation

Candidate names, photos, institutions, and demographic signals are stripped before evaluation begins, eliminating identity-based bias.
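For structured metadata, stripping identity fields can be as simple as the sketch below. The field names in `IDENTITY_FIELDS` and the `redact` helper are assumptions made for illustration. Identity signals embedded in free text (a name in a document header, for instance) need separate handling that this example does not attempt.

```python
# Hypothetical metadata keys that could reveal who the candidate is.
IDENTITY_FIELDS = {"name", "photo_url", "institution", "gender", "age", "ethnicity"}


def redact(submission: dict) -> dict:
    """Return a copy of the submission without identity-bearing fields."""
    return {key: value for key, value in submission.items()
            if key not in IDENTITY_FIELDS}


raw = {
    "submission_id": "sub-001",
    "name": "Jordan Kim",
    "institution": "Example University",
    "work": "Quarterly churn analysis ...",
}
blind = redact(raw)
print(sorted(blind))  # prints ['submission_id', 'work']
```

Returning a copy rather than mutating the input keeps the original record available for the employer-facing view once scoring is complete.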

Consistency Audits

Identical submissions are expected to produce identical scores. We verify this with weekly consistency checks that re-score reference submissions and surface any score drift above a tolerance threshold.
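A consistency check of this kind can be sketched as a comparison between stored baseline scores and fresh re-scores of the same submissions. The function name `find_drift` and the 0.1 tolerance are assumptions; the page does not state the actual threshold.

```python
def find_drift(baseline: dict[str, float], rescored: dict[str, float],
               tolerance: float = 0.1) -> dict[str, float]:
    """Return {submission_id: signed drift} where the score moved by more
    than the tolerance. Submissions missing from the re-score are skipped."""
    drift = {}
    for submission_id, old_score in baseline.items():
        if submission_id not in rescored:
            continue
        delta = rescored[submission_id] - old_score
        if abs(delta) > tolerance:
            drift[submission_id] = round(delta, 3)
    return drift


baseline = {"ref-1": 4.0, "ref-2": 3.0, "ref-3": 2.0}
rescored = {"ref-1": 4.0, "ref-2": 3.3, "ref-3": 1.95}
print(find_drift(baseline, rescored))  # prints {'ref-2': 0.3}
```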

Explainability

Every score includes a traceable explanation referencing specific sections of the candidate's work — no opaque model outputs.

Human-in-Loop

Borderline scores (within 0.3 of a band boundary) are automatically flagged for optional human review before being finalised.
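The borderline rule can be expressed in a few lines. This sketch assumes the band boundaries are the lower bounds of the four upper bands (1.5, 2.5, 3.5, 4.5) and that "within 0.3" is inclusive; `is_borderline` is a hypothetical helper name.

```python
BAND_BOUNDARIES = (1.5, 2.5, 3.5, 4.5)  # assumed: lower bounds of each band


def is_borderline(score: float, margin: float = 0.3) -> bool:
    """True if the score lies within `margin` of any band boundary."""
    epsilon = 1e-9  # absorbs floating-point error at the exact margin
    return any(abs(score - boundary) <= margin + epsilon
               for boundary in BAND_BOUNDARIES)


print(is_borderline(3.6))  # prints True  (0.1 above the 3.5 boundary)
print(is_borderline(3.0))  # prints False (0.5 from both 2.5 and 3.5)
```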

Continuous Improvement

Scoring models are retrained quarterly on verified human-reviewed samples, with regression tests run before every production deployment.

AI FAQ

Common Questions

Straight answers to the questions we hear most from employers and candidates.

The model is constrained to score only what is present in the submission, using the rubric as the sole evaluation frame. It cannot infer intent or award credit for absent content. All outputs are validated against a scoring schema before being stored.
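Validation against a scoring schema might look like the sketch below. The output shape (a `scores` object and a `feedback` object keyed by dimension) and the `validate_output` function are assumptions for illustration; the actual schema is not described on this page.

```python
EXPECTED_DIMENSIONS = {"Accuracy", "Clarity", "Completeness", "Creativity", "Process"}


def validate_output(output: dict) -> list[str]:
    """Return a list of schema violations; an empty list means valid."""
    scores = output.get("scores")
    if not isinstance(scores, dict):
        return ["'scores' must be an object keyed by dimension"]
    errors = []
    present = set(scores)
    for name in sorted(EXPECTED_DIMENSIONS - present):
        errors.append(f"missing score for {name}")
    for name in sorted(present - EXPECTED_DIMENSIONS):
        errors.append(f"unexpected dimension {name}")
    for name in sorted(present & EXPECTED_DIMENSIONS):
        value = scores[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name} score must be a number")
        elif not 0 <= value <= 5:
            errors.append(f"{name} score {value} is outside the 0-5 scale")
    feedback = output.get("feedback")
    if not isinstance(feedback, dict):
        feedback = {}
    for name in sorted(present & EXPECTED_DIMENSIONS):
        if not str(feedback.get(name, "")).strip():
            errors.append(f"missing feedback for {name}")
    return errors


bad = {"scores": {"Accuracy": 6, "Clarity": 3.5}, "feedback": {"Clarity": "Clear."}}
for problem in validate_output(bad):
    print(problem)
```

Only outputs that return an empty list would be stored; anything else can be rejected or retried before it reaches an employer or candidate.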

Yes. ProveIQ accepts text documents, spreadsheets, code files, slide decks, and structured data. The parser normalises each format before evaluation, ensuring consistent scoring regardless of how the candidate chose to present their work.

Low-confidence scores (where the model's internal confidence falls below a threshold) are automatically escalated for human review. The candidate's score is held until a reviewer confirms or overrides the AI's assessment.

When an employer creates a new rubric, our platform runs a calibration check using synthetic submissions across the score spectrum. This surfaces logical inconsistencies in the weighting before any real candidate work is evaluated.

Candidate submission data is never used for model training without explicit, opt-in consent from both the employer and the candidate. Our default training pipeline uses only synthetic and consent-confirmed human-reviewed samples.

ProveIQ supports evaluation in English, Spanish, French, German, and Portuguese. Submissions in other languages are accepted but routed to human reviewers. We are expanding AI language coverage quarterly.
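The routing rule for languages can be sketched as a set lookup. The two-letter codes follow ISO 639-1; the `route` function, its return values, and the assumption that the submission language is already known are all illustrative.

```python
# ISO 639-1 codes for English, Spanish, French, German, and Portuguese.
AI_SUPPORTED = {"en", "es", "fr", "de", "pt"}


def route(language_code: str) -> str:
    """Return 'ai' for supported languages, otherwise 'human_review'."""
    # Keep only the primary subtag, so "pt-BR" and "pt_BR" both map to "pt".
    primary = language_code.strip().lower().replace("_", "-").split("-")[0]
    return "ai" if primary in AI_SUPPORTED else "human_review"


print(route("pt-BR"))  # prints ai
print(route("ja"))     # prints human_review
```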

See the AI in action

Try a free sample assessment and receive your own AI-generated score card in under five minutes.

Try a Free Assessment

Post an Assessment