Challenge
The Certified Admin exams included over 80 unique short-answer questions across eight assessments, and every response had to be graded by hand. Manual grading was time-consuming, inconsistent across graders, and difficult to scale.
Delivering contextual, personalized feedback was equally challenging: learners from varying backgrounds submitted answers that were vague, overly detailed, or only partially correct.
Solution
AI-Powered Grading Engine
- Trained on course materials, answer keys, and company-specific context (a grading-call sketch follows this list).
- Handles vague, nuanced, and over-detailed answers.
- Excludes biased or incorrect historical data from training.
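To make the grading engine concrete, here is a minimal Python sketch of a rubric-grounded grading call. Everything here is illustrative: `call_llm` is a placeholder for the proprietary LLM client, and `Rubric` is an assumed shape, not the production schema.

```python
# Minimal sketch of a rubric-grounded grading call. All names are
# illustrative: call_llm() stands in for the proprietary LLM client,
# and Rubric is an assumed shape, not the production schema.
from dataclasses import dataclass

@dataclass
class Rubric:
    question: str
    key_points: list[str]  # concepts the answer key requires
    context: str           # company-specific context for the grader

GRADING_PROMPT = """You are grading a short-answer certification response.
Question: {question}
Required concepts: {key_points}
Company context: {context}
Learner answer: {answer}

Mark each required concept present, partial, or missing, then give an
overall score from 0 to 3. Vague or over-detailed answers still earn
credit for every concept they genuinely cover."""

def call_llm(prompt: str) -> str:
    """Placeholder for the proprietary LLM; swap in the real client."""
    raise NotImplementedError

def grade(rubric: Rubric, answer: str) -> str:
    # Ground the prompt in the answer key and company context so the
    # same call can absorb vague or over-detailed answers.
    prompt = GRADING_PROMPT.format(
        question=rubric.question,
        key_points="; ".join(rubric.key_points),
        context=rubric.context,
        answer=answer,
    )
    return call_llm(prompt)
```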
Feedback Generator
- Generates precise, supportive feedback (e.g., "missing concept," "good structure, incomplete logic").
- Encourages learning without pass/fail framing (see the feedback sketch after this list).
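As a sketch of how graded concepts could map to the supportive phrasing above, the snippet below renders per-concept results into formative feedback with no pass/fail verdict. The `ConceptResult` shape and the templates are assumptions for illustration, not the production implementation.

```python
# Sketch of the feedback layer: per-concept grading results become the
# supportive phrasing described above. ConceptResult and the templates
# are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class ConceptResult:
    concept: str
    status: str  # "present" | "partial" | "missing"

TEMPLATES = {
    "present": "Nicely covered: {concept}.",
    "partial": "Good structure, incomplete logic on {concept}: expand your reasoning.",
    "missing": "Missing concept: {concept}. Revisit the related module.",
}

def build_feedback(results: list[ConceptResult]) -> str:
    # Deliberately no "pass"/"fail" verdict: feedback stays formative.
    return "\n".join(
        TEMPLATES[r.status].format(concept=r.concept) for r in results
    )

print(build_feedback([
    ConceptResult("role hierarchy", "present"),
    ConceptResult("sharing rules", "missing"),
]))
```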
Evaluation Pipeline
- Uses blind comparison against human graders to refine model accuracy (an agreement-scoring sketch follows this list).
- Implements rubric-based scoring aligned with SME validation.
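A hedged sketch of the blind-comparison step: the model and SMEs score the same de-identified responses, and exact agreement is reported. The scores and the 95% target (drawn from the agreement range under Impact) are illustrative.

```python
# Sketch of the blind-comparison check: the model and SMEs score the
# same de-identified responses, and exact agreement is reported. The
# scores and the 95% target (from the agreement range under Impact)
# are illustrative.
def agreement_rate(ai_scores: list[int], sme_scores: list[int]) -> float:
    assert len(ai_scores) == len(sme_scores) and ai_scores
    matches = sum(a == s for a, s in zip(ai_scores, sme_scores))
    return matches / len(ai_scores)

ai_scores = [3, 2, 0, 3, 1]   # model's rubric scores
sme_scores = [3, 2, 1, 3, 1]  # human graders' scores, graded blind
rate = agreement_rate(ai_scores, sme_scores)
print(f"Agreement: {rate:.0%}")  # 80% in this toy example
if rate < 0.95:
    print("Below target: route disagreements to SMEs and refine prompts.")
```

In practice, disagreements surfaced by this check fed the SME blind-testing and prompt-iteration loop described under My Role.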
My Role
- Led project design and development from ideation to pilot testing.
- Authored custom prompt architecture and validation scripts.
- Ran blind testing with SMEs and iterated based on feedback.
- Designed rollout and data governance framework.
Tools & Technologies
- Development: Excel, Word, proprietary LLM
- Integration: Skilljar (proposed; not yet implemented)
- Evaluation: SME blind scoring, feedback loop design
Impact
- 50% reduction in manual grading time
- 95–99% agreement with human graders
- Consistent, high-quality learner feedback delivered across all responses
- Scalable solution for 70+ questions across 7 certification exams
- Strategic roadmap for future rollout and ethical AI expansion