Tanay Vaswani

Software Engineer

Building data systems and managing infrastructure at Confident AI, Inc.

About Me

I’m Tanay Vaswani, a software engineer building systems for data-intensive applications and the infrastructure they run on.

I’m currently at Confident AI, Inc. (YC W25) working with large language models, Kubernetes, distributed systems, observability, Python, Node.js, and cloud infrastructure.

I enjoy working on platform engineering, designing scalable backend architectures and experimenting with LLMOps tooling and workflows.

I’ve always had a thing for open source, and I used to contribute to both commercial and free open-source software.

Reach out to me

I’m most active on Twitter and best reached via email.

Experience

A brief overview of where I have worked and where I currently work.

Confident AI, Inc. (YC W25)

Current Employer
  • Migrated the online evaluation system to a delegation-based asynchronous architecture, enabling independent metric runs per trace on workers.
  • Improved system throughput by 240% and reduced the peak-traffic error rate from ~50% to <0.1%, while handling 5x higher load than the previous peak.
  • Rearchitected the code sandbox infrastructure onto cloud-native serverless functions (AWS Lambda & Azure Functions), improving scalability and isolation.
  • Built the Anthropic integration for the DeepEval SDK.
  • Developed LangChain, LangSmith & OpenAI integrations for the DeepEval TypeScript SDK.
  • Managing multi-region infrastructure on AWS & Azure (Docker, CI/CD, Kubernetes, Terraform IaC).
  • Built and maintained reusable Terraform modules to provision and manage AWS infrastructure (VPC, EKS, RDS, ALB/NLB, NAT Gateway, S3, Networking, IAM) for multi-AZ deployments.
  • Managed production Kubernetes workloads and configs (Deployments, Services, PV/PVC, Ingress, RBAC, Secrets, ConfigMaps) using ArgoCD with sync waves.
  • Designed & built asynchronous processing systems for alerts.
  • Built a centralised streaming gateway for LLM Playground/Arena, Experiments and Evaluations.
  • Extended LLM support via cloud providers (Amazon Bedrock, Google Vertex AI, Azure OpenAI).
  • Implemented provider-agnostic AI capabilities across the platform for AI summarisation.
  • Developed a cloud-agnostic storage adapter (AWS S3, GCP Buckets, Azure Blob, MinIO).
  • Built secure content delivery using pre-signed URLs from object storage.
  • Built a secure code sandbox service for evaluation using code-based metrics.
  • Amazon Web Services (AWS)
  • Google Cloud Platform (GCP)
  • Kubernetes
  • Docker
  • Terraform
  • Node.js
  • Python
  • PostgreSQL
  • ClickHouse
  • Redis
  • Object Oriented Design
  • Backend Engineering
  • React.js
  • Tailwind CSS
  • LLMOps

Puch AI

TurboML, Inc.

Quote of the day

"To be insanely hopeful even after all that, you call it madness, I call it strength."

© 2024 Tanay Vaswani. All rights reserved.