Rag Patel

Software Engineer · Builder · CS @ University of Toronto

I design and ship scalable software, from APIs and data pipelines to polished frontends. I care about clarity, execution, and building things that actually get used.

Second-Year CS @ UofT

💼 Prev @ ACTO

⚡ Data-Driven Engineering

🚀 Building, Not Just Studying

Experience

ACTO

May 2025 – Aug 2025 · Toronto, ON

Software Engineering Intern

▸Built model-routing and inference pipelines across 5+ LLMs in a Laravel/PHP backend, improving response times by ~30%
▸Designed evaluation and A/B testing pipelines to detect hallucinations, reducing misclassification from 28% to 22%
▸Implemented Redis caching for conversations and common queries, cutting database calls by ~50%
▸Added automated validators across 12+ API endpoints, reducing manual QA by ~5 hrs/week
▸Integrated Datadog observability (dashboards + alerts), cutting mean time to detect issues by ~50%

PHP (Laravel) · Redis · PostgreSQL · Datadog · LLM APIs

Card Marketplace

Sep 2025 – Present · Toronto, ON

Software Engineer

▸Built Spring Boot backend with 20+ REST APIs and optimized PostgreSQL schemas, reducing query times by ~40%
▸Designed async order pipelines with transactional guarantees; load-tested to 500+ concurrent orders without inconsistencies
▸Integrated Stripe payments with webhooks and message queues for async checkout and refunds

Java (Spring Boot) · PostgreSQL · Stripe · Async Processing

K-Man Ventures

May 2024 – Oct 2024 · Remote

Frontend Developer

▸Refactored legacy UI into mobile-first React components, raising the Lighthouse mobile score from 62 to 89
▸Reduced initial load time by ~300ms via code-splitting, lazy loading, and memoization
▸Cut layout shift by ~60% through performance-focused frontend optimizations

React · JavaScript · Performance Optimization

Featured Projects

SoccerOracle

FastAPI · Redis · PGVector · LightGBM

ML-Powered Match & Player Analytics

▸Built a multi-output LightGBM model predicting 9+ match stats (shots, possession, cards, etc.)
▸Designed a distributed FastAPI backend with Redis workers, cutting inference latency by ~80%
▸Engineered PCA-based player embeddings stored in PGVector for fast similarity search across 1,000+ players
▸Automated weekly retraining with GitHub Actions + MLflow, reducing prediction error by ~12%

Under Construction

ThreadAI

Node.js · AWS · Weaviate · PostgreSQL

Event-Driven Knowledge Extraction Platform

▸Architected an event-driven AWS pipeline (S3 → SQS → Lambda → Postgres) for scalable ingestion
▸Deployed Weaviate on ECS Fargate for hybrid vector + BM25 search
▸Built end-to-end processing workflows with validation, status tracking, and automated tests
▸Designed the system for fault tolerance and horizontal scalability

Under Construction

SMART-AIR

Android (Java) · Firebase · Firestore

Asthma Management Android App

▸Built a 3-role system (Child / Parent / Provider) with granular privacy and secure data sharing
▸Implemented medication logging, wellness tracking, and automatic PEF zone classification with safety alerts
▸Designed a triage and escalation system with real-time parent notifications for critical events
▸Added adherence tracking, inventory alerts, and exportable PDF/CSV reports for providers
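
The PEF zone classification mentioned above can be illustrated with a short sketch. It assumes the conventional asthma action plan thresholds (green at 80% or more of personal best, yellow from 50% up to 80%, red below 50%); the app's actual thresholds are not stated here, and the app itself is written in Java, so the function name and units below are hypothetical.

```python
def pef_zone(pef_l_min: float, personal_best_l_min: float) -> str:
    """Classify a peak expiratory flow reading against a personal best.

    Thresholds follow the common green/yellow/red convention and are an
    assumption, not a statement of what SMART-AIR uses.
    """
    if personal_best_l_min <= 0:
        raise ValueError("personal best must be positive")
    if pef_l_min < 0:
        raise ValueError("PEF reading cannot be negative")

    ratio = pef_l_min / personal_best_l_min
    if ratio >= 0.8:
        return "green"   # 80% or more of personal best
    if ratio >= 0.5:
        return "yellow"  # 50% up to (but not including) 80%
    return "red"         # below 50%: the zone that would trigger a safety alert
```

A reading of 300 L/min against a personal best of 500 L/min is 60% of best, which lands in the yellow zone.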

View on GitHub

Technical Deep Dives

Real engineering challenges I've solved and the decisions behind them

LLM Routing & Evaluation Pipeline

Optimizing multi-model inference under real constraints

Problem

Requests were routed across multiple LLMs with inconsistent latency and hallucination risk. Prompt changes were hard to evaluate safely.

Solution

▸Built a model-routing layer across 5+ LLMs based on task and latency
▸Designed an evaluation + A/B testing pipeline to measure hallucinations and prompt quality
▸Added Redis caching for shared context and repeat requests
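
Routing by task and latency can be sketched roughly as follows. This is an illustrative Python version, not the production Laravel/PHP code; the model names, latency figures, quality scores, and the `route` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str             # hypothetical model identifier
    tasks: frozenset      # task types this model is trusted with
    p95_latency_ms: int   # assumed observed p95 latency
    quality: float        # assumed offline evaluation score in [0, 1]


# Illustrative registry; every name and number here is made up for the sketch.
REGISTRY = [
    ModelProfile("small-fast", frozenset({"classify", "extract"}), 300, 0.78),
    ModelProfile("mid-general", frozenset({"classify", "extract", "chat"}), 900, 0.86),
    ModelProfile("large-accurate", frozenset({"chat", "summarize"}), 2500, 0.93),
]


def route(task: str, latency_budget_ms: int, registry=REGISTRY) -> ModelProfile:
    """Pick the highest-quality model that supports the task within the budget.

    If no capable model fits the budget, fall back to the fastest capable one.
    """
    capable = [m for m in registry if task in m.tasks]
    if not capable:
        raise ValueError(f"no model registered for task {task!r}")

    within_budget = [m for m in capable if m.p95_latency_ms <= latency_budget_ms]
    if within_budget:
        return max(within_budget, key=lambda m: m.quality)
    return min(capable, key=lambda m: m.p95_latency_ms)
```

With this registry, `route("classify", 500)` selects `small-fast`, while relaxing the budget to 1000 ms selects `mid-general` because it scores higher and still fits.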

Tradeoffs

▸More infrastructure and routing complexity
▸Slightly higher infra cost to gain correctness and visibility

Result

~30% faster response times. Misclassification fell from 28% to 22%. Prompt changes became measurable and safe to ship.

LLM Systems · Backend · Evaluation

Distributed Caching with Redis

Reducing database load in production APIs

Problem

High read amplification caused latency spikes and heavy database load under traffic.

Solution

▸Implemented Redis caching for conversation history and frequent queries
▸Used TTL-based invalidation to balance freshness and performance
▸Standardized cache usage across API endpoints
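
The cache-aside pattern with TTL-based invalidation can be sketched as below. To keep the example runnable without a server, `TTLCache` is a hypothetical in-memory stand-in for Redis; in a real deployment the `get` and `set` calls would map onto Redis reads and writes with an expiry. The key format and the `get_or_load` helper are assumptions for illustration.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory stand-in for a key/value store with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock   # injectable so expiry can be tested deterministically
        self._store = {}      # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # lazy expiry on read
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_or_load(cache: TTLCache, key: str, loader: Callable[[], Any],
                ttl_seconds: float = 60.0) -> Any:
    """Serve from cache when possible; otherwise load from the source and cache it.

    A loader result of None is treated as a miss and is not cached.
    """
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = loader()
    if value is not None:
        cache.set(key, value, ttl_seconds)
    return value
```

The TTL is the freshness knob: a shorter value bounds how stale a read can be, at the cost of more trips to the database.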

Tradeoffs

▸Cache invalidation added complexity
▸Required careful handling of stale reads

Result

~50% reduction in database calls. More stable latency under load. Improved scalability without DB over-provisioning.

Redis · Performance · Scalability

Async Order Processing & Data Consistency

Designing safe concurrency in a marketplace backend

Problem

Concurrent orders, payments, and refunds risked race conditions and partial failures.

Solution

▸Built async pipelines with transactional guarantees
▸Integrated Stripe webhooks and message queues
▸Load-tested concurrency scenarios
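
One piece of this, applying a payment webhook exactly once inside a single transaction, can be sketched as follows. This is a simplified Python and SQLite illustration rather than the Spring Boot and PostgreSQL implementation; the table names, the `handle_payment_event` function, and the status values are hypothetical. It relies on webhook events carrying a unique ID and possibly being delivered more than once.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
        """
    )


def handle_payment_event(conn: sqlite3.Connection, event_id: str,
                         order_id: str, new_status: str) -> bool:
    """Apply a payment event at most once. Returns True if it was applied."""
    try:
        with conn:  # one transaction: commit on success, roll back on any error
            # The primary key rejects a second delivery of the same event.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
            )
            cur = conn.execute(
                "UPDATE orders SET status = ? WHERE id = ?", (new_status, order_id)
            )
            if cur.rowcount != 1:
                # Rolls back the event insert too, so a retry can succeed later.
                raise LookupError(f"unknown order {order_id!r}")
    except sqlite3.IntegrityError:
        return False  # duplicate delivery, already handled
    return True
```

Recording the event ID and updating the order in the same transaction means a crash between the two steps cannot leave an event marked as processed without its effect, or the reverse.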

Tradeoffs

▸Eventual consistency required stricter error handling
▸Higher system complexity than synchronous flows

Result

Sustained 500+ concurrent orders. Zero data inconsistencies under load.

Distributed Systems · Concurrency · Transactions

About Me

I'm a Computer Science student at the University of Toronto focused on building real-world software systems. I care about clean architecture, correctness, and performance, and I prefer shipping things end-to-end over isolated coding exercises. I've worked in startup and intern environments, building backend systems, ML-driven applications, and production features under real constraints. My approach is systems-first: understand the problem, design thoughtfully, then execute cleanly. Right now, I'm focused on becoming a strong software engineer by building, breaking, and refining real products—both in industry and through projects.

Let's Work Together

Currently open to internships and software engineering roles.

📧 rag.patel@mail.utoronto.ca

💻 GitHub

💼 LinkedIn

📄 Resume