Collaboration with UBC to develop a Python package for context-aware evaluation and generation of unit tests in DS/ML projects using LangChain.
Compatible with multiple service providers (e.g. ChatGPT, Claude, Gemini) and locally hosted LLM agents.
Statistical analyses were conducted to compare different LLMs and ensure evaluations are aligned with human expert judgements.
Awarded MDS Vancouver Audience Choice: Best Talk. Contribution ongoing with pending authorship.
Some of the most interesting projects I have worked on. Click each image for more details.