Machine Learning Engineer Intern — Ciroos
- LLM Evaluation Framework: Built an evaluation system that benchmarks LLMs on latency, token usage, and output quality, automating “LLM-as-a-Judge” workflows with Python and Baseten for scalable model comparison and reporting.
- Document Intelligence Systems: Engineered document-to-JSON parsing pipelines with Docling, Pydantic, and Claude agents, improving schema consistency and reducing hallucinations.
- Real-Time Visual Analytics Platform: Built a serverless analytics tool that captures and stores screenshots of authenticated dashboards, integrating Vision-Language Models to interpret charts automatically and generate actionable insights.