About Me

Shipping ML research
to production.

Applied ML researcher and engineer with 5+ years closing the loop from experiment to production: multimodal learning, LLM post-training (GRPO, SFT, QLoRA), large-scale retrieval & ranking, and document-understanding systems. 2 US patents filed. Published in IEEE, Semantic Web Journal, and Alexa Prize proceedings. Collaborates closely with product, engineering, and leadership stakeholders.

LLM Post-training: GRPO · SFT · QLoRA
Multimodal: ViT-DeBERTa · LayoutLM
Retrieval at 27M+ entities
2 US Patents Filed · 2025
MS USC · GPA 4.0/4.0

At BILL, I've shipped production systems including a 27M-entity multimodal search system (patent pending), a range of deep learning and traditional ML solutions, and end-to-end LLM agent orchestration for customer support and invoice understanding.

Before this, I spent two years as a Research Assistant at USC's Information Sciences Institute, working on multi-agent RL for the Diplomacy game (Dr. Jon May) and knowledge graph quality & embeddings (Dr. Filip Ilievski). I completed my M.S. in Computer Science with Honors and a 4.0 GPA at the University of Southern California and my B.Tech with Rank 1 at K. J. Somaiya College of Engineering, Mumbai.

Outside work: guitar, snowboarding, hiking, and painting. Find my resume here.

Focus Areas

🤖 LLM Post-Training 👁️ Multimodal Learning 💬 NLP & Agents 🔍 Retrieval & Ranking 📄 Document AI

Education

MS Computer Science (Honors), Dec 2022

GPA: 4.0 / 4.0

University of Southern California

BTech Computer Engineering, Jun 2018

GPA: 9.56 / 10 · Rank 1

University of Mumbai, India

Built & Shipped

Selected Projects

For the full list, see my resume.

RAG LLM NLP

RAG Health Assistant · USC

End-to-end RAG pipeline (ChromaDB, LangChain, FastAPI, OpenAI APIs, PHI guardrails via Presidio Analyzer) to promote HPV vaccination to Los Angeles General Hospital patients as a pilot. Presented as a poster at USC ShowCAIS 2026. Demo · GitHub

USC ShowCAIS 2026 Presentation

Dialogue Generate & Rank

Viola — Alexa Prize Socialbot Grand Challenge 4

USC team advanced as a semifinalist in Amazon's Alexa Prize Socialbot Grand Challenge 4. Contributed to a few components of the generate-and-rank pipeline: DialoGPT generation, SNIPS NLU intent classification, FSM dialogue management, BERT re-ranking, and AI safety guardrails. Article

Alexa Prize Semifinalist · SGC4

RAG Open-Domain QA

Retrieval-Augmented QA (Natural Questions)

Fine-tuned BART-large for dense retrieval + answer generation, achieving 92.6% top-1k retrieval accuracy and 43.9% EM accuracy on the Natural Questions dataset.

92.6% top-1k · 43.9% EM

Computer Vision Siamese Net

VeriSign — Signature Verification

Siamese network (CNN + Pooling + BatchNorm + Dropout) for one-shot signature verification: determining whether two signatures belong to the same person. GitHub →

Medical Imaging CNN

Chest X-Ray Pneumonia Detection

CNN trained on chest X-rays to detect pneumonia. Achieved 94.56% test accuracy and a recall of 0.97.

94.56% acc · 0.97 recall

Computer Vision IEEE Published

Silatra — Sign Language Translation

Android app translating Indian Sign Language gestures into voice using image processing, segmentation, KNN & HMM algorithms. Published at IEEE ICCCNT 2018.

Published · IEEE ICCCNT

Career

Work Experience

Building and shipping ML systems from research prototype to production at scale.

BILL · San Jose, CA

Feb 2023 – Present

Senior Machine Learning Engineer

Achieved F1 = 0.90 on document-invoice boundary detection (+50% over production LSTM baseline) by post-training Qwen 2.5 (1.5B–7B) with GRPO + QLoRA; conducted systematic ablations across reward policy design, LoRA rank, and SFT baseline (F1 = 0.82) to validate the RL approach.
Spearheaded and shipped a production entity search system over 27M invoice embeddings, achieving 91% top-1 accuracy at 280ms P95 latency via a pointwise Learning-to-Rank pipeline with XGBoost Ranker + OpenSearch ANN. ⚡ Patent filed 1/31/2025
Shipped production LLM agent orchestration (retrieval, tool selection, context engineering, prompt engineering, and grounding mechanisms) using LangChain and n8n, with evaluation and monitoring via Braintrust & Datadog, for customer support, IVR payment delivery, and invoice understanding agents.
Engineered a multi-task document-understanding model achieving 89% classification and 76% field-extraction accuracy by jointly training a Siamese classification head + NER heads on a shared LayoutLM (2D attention transformer) encoder.
Designed a logo extraction pipeline achieving 0.90 mAP (YOLOv8) and 87% similarity accuracy (ConvNeXt) to automate entity deduplication across millions of invoice logos.
Owned end-to-end retrieval pipeline design (query/doc representation, negative sampling, A/B experimentation, MRR offline evaluation). Mentored multiple engineers.

Information Sciences Institute, USC

Jan 2022 – Dec 2022

Research Assistant — May Team (Dr. Jon May)

Improved multi-agent RL Diplomacy bot win-rate by 5% by engineering natural-language DAIDE message generation and rule-based strategic reasoning on top of targeted changes to the reinforcement learning reward modeling policy in DipNet.

Center on Knowledge Graphs, USC

Feb 2021 – Jan 2022

Research Assistant — Dr. Filip Ilievski

Boosted graph embedding quality by +10.6% Spearman correlation (0.66 → 0.73 on WordSim353) by retrofitting node representations with BERT embeddings + structural features from Wikidata, Probase, and DBpedia. Published in Semantic Web Journal.
Built an automated quality-assessment pipeline over 1.1B Wikidata statements, identifying low-quality links via deleted, deprecated, and constraint-violation signals.

Barclays Global Service Centre · Pune, India

Jul 2018 – Dec 2020

Software Developer, Barclaycard UK

Devised a real-time fraud detection system on streaming transactions at 20ms avg latency (ROC-AUC 0.7) using ensemble models over Kafka + PySpark, reducing false-positive escalations.
Applied Latent Dirichlet Allocation (unsupervised topic modeling) to extract insights from iOS and Android application reviews and customer complaints.
Implemented dashboards that automatically generate real-time delivery metrics for more than 30 teams from Agile Central and Jira data sources, saving around 150 person-hours annually. Received the Barclays Award of Stewardship for this initiative.

Research

Publications & Patents

Peer-reviewed papers, workshop proceedings, and intellectual property.

⚡

2 US Patents Filed · 2025

Layout-aware field extraction (US 19/410,970) — System, method, and computer program product for extracting fields from vendor documents using layout detection and user behavior patterns.
Multimodal embedding system for business recommendations (US 19/043,191) — multimodal embeddings for entity search at scale.

Journal of Web Semantics · Vol. 72 · 2022 · 146 citations

A Study of the Quality of Wikidata

First-authored empirical audit of Wikidata's data quality across completeness, consistency, and schema conformance at scale. Developed automated assessment pipelines and surfaced systematic gaps — providing the community with actionable quality metrics and a reproducible framework adopted by downstream KG research.

Kartik Shenoy, Filip Ilievski, Daniel Garijo, Daniel Schwabe, Pedro Szekely

Code PDF

IEEE ICCCNT · 9th International Conference · Oct 2018

Real-time Indian Sign Language (ISL) Recognition

First-authored undergraduate capstone turned IEEE publication. Designed and built an Android application using image processing, colour-based hand segmentation, Hidden Markov Models and KNN classifiers to translate Indian Sign Language gestures into synthesised voice output for hearing- and speech-impaired users in real time.

Kartik Shenoy, Tejas Dastane, Varun Rao, Devendra Vyavaharkar

Code PDF

Semantic Web · Vol. 15(3), pp. 877–896 · 2024

A Study of Concept Similarity in Wikidata

Contributed the graph-embedding retrofitting pipeline, which injects BERT embeddings and structural features from Wikidata, Probase, and DBpedia into node representations, improving Spearman correlation from 0.66 → 0.73 on WordSim353. The study showed that pairing language models with rich structural knowledge achieves best-in-class concept similarity performance.

Filip Ilievski, Kartik Shenoy, Hans Chalupsky, Nicholas Klein, Pedro Szekely

PDF

Iberoamerican Conf. on Knowledge Graphs & Semantic Web · 2022

Does Wikidata Support Analogical Reasoning?

Ran evaluation experiments across multiple Wikidata relation categories using text embeddings to test whether the knowledge graph's relational structure supports analogical reasoning. Found that relevant analogical information is frequently absent or inconsistently modelled — establishing desiderata for future automated analogy extraction.

Filip Ilievski, Jay Pujara, Kartik Shenoy

PDF

Amazon Alexa Prize · Socialbot Grand Challenge 4 · 2021

Viola: A Topic Agnostic Generate-and-Rank Dialogue System

USC's entry in Amazon's Alexa Prize Socialbot Grand Challenge 4, advancing as a semifinalist from a competitive field of university teams. Contributed to a few components of the generate-and-rank pipeline: DialoGPT for candidate generation, SNIPS NLU intent classification, FSM dialogue management, BERT-based response re-ranking, and AI safety guardrails. (3rd author, large team)

Hyundong Cho, Basel Shbita, Kartik Shenoy, Shuai Liu, Nikhil Patel, et al.

YouTube PDF
