Hi! I'm Vijay
I build petabyte-scale data platforms, real-time streaming systems, and AI-powered applications. Currently leading the next-generation data platform at TRM Labs.
I'm a software engineer who loves building data and AI systems. But it all started with a video game.
When I was a kid, my computer crashed in the middle of a game. I was so desperate to get back to playing that I spent hours reading articles online and figured out how to reboot and reinstall the entire operating system, all just to play that same video game. But somewhere in that process, I fell in love with computers themselves. How they work, why they break, and what you can make them do.
That curiosity never went away. It led me to study Computer Science at Shiv Nadar University, and from there into a career building software at scale. I've worked on data systems at LinkedIn and Expedia, and now I'm at TRM Labs, where I lead the engineering of a next-generation data platform as a Staff Software Engineer.
Over the years, I've found my sweet spot in data engineering: designing real-time streaming systems, building lakehouse architectures, and solving the kinds of problems that come with operating at petabyte scale. More recently, I've been exploring the world of AI and finding new ways it intersects with the large-scale data systems I build every day.
I also enjoy sharing what I learn. Whether it's speaking at conferences or writing technical deep-dives, I find that explaining ideas to others is the best way to truly understand them.
For the full career details, check out my LinkedIn.
Here are some of my recent talks on what I've learned from building large-scale data systems.
Designing petabyte-scale streaming platforms processing up to 1 million events/second for real-time blockchain analytics.
Watch on YouTube →
How TRM Labs migrated from a serverless data warehouse to a modern lakehouse using Apache Iceberg and StarRocks.
Watch on YouTube →
Adopting Apache Iceberg as a cost-effective, high-performance alternative to managed cloud data warehouses.
Watch on YouTube →
Recognized for advancing StarRocks' presence in blockchain analytics, championing innovation, and expanding adoption across the industry through talks and technical writing.
Read more →
Recognized as a Next Gen Trailblazer at TRM Labs for leading the build of the next-generation data platform and pioneering real-time streaming infrastructure.
View on OpenSea →
Featured in TRM Labs' engineering spotlight series for building the company's real-time streaming infrastructure from scratch as the first engineering hire in India.
Read the spotlight →
Heading to Las Vegas for Google Cloud Next, the biggest Google Cloud event of the year.
Event details →
Joining the data engineering community to discuss the latest in data platforms, streaming, and lakehouse architectures.
Event details →
I write about the architecture and engineering decisions behind large-scale data systems.
How we built an in-house real-time data processing pipeline to reliably process petabytes of blockchain data, comparing Spark Structured Streaming, Apache Flink, Beam, and Kafka Streams.
Read on Medium →
Data Platform
The journey of migrating TRM Labs' analytics infrastructure from BigQuery to a modern lakehouse architecture.
Read on TRM Labs Blog →
Data Lakehouse
A deep dive into Apache Iceberg's architecture and why it's reshaping how we think about data lakehouse table formats.
Read on Medium →
Streaming
A comprehensive guide to stream joins: how they work, when to use each type, and patterns for instantaneous data enrichment.
Read on Medium →
Real-Time
Challenges and strategies for writing high-throughput streaming data to databases without losing reliability or performance.
Read on Medium →
Infrastructure
A data platform engineer's journey with Kubernetes, exploring its capabilities across various data engineering projects.
Read on Medium →