Hari Vilas Panjwani

AI Engineer at Amazon — Agentic AI, LLMs & AWS AI Infrastructure

Building production-grade AI systems at Amazon — from conversational agents powered by AWS Bedrock to multi-step agentic workflows that automate complex seller operations at scale. Published researcher in deep learning with hands-on expertise in LLM evaluation, RAG architectures, and AI infrastructure on AWS. Passionate about the shift from static ML pipelines to autonomous AI agents that reason, plan, and act.

Agentic AI • LLMs • AWS Bedrock • RAG • Conversational AI • Python • Java • AWS

01. Experience

Amazon

AI/ML Engineer • Seattle, WA

Architected Seller Assistant — a production conversational AI agent using RAG + AWS Bedrock (Claude) that autonomously resolves seller claims, eliminating hours of manual review per case
Designed multi-step agentic workflows that reason over seller documents, invoke APIs, and produce structured decisions — reducing claims processing time by 40%
Built and deployed AI infrastructure on AWS: vector stores (OpenSearch), prompt pipelines, LLM evaluation harnesses, and observability for hallucination detection
Migrated high-throughput email notification service handling 250K+ monthly events to event-driven AWS architecture (SNS/SQS/Lambda)
Redesigned seller feedback system (Java, React) increasing positive feedback rate by 30%

Northeastern University

Research Engineer • Boston, MA

Built a large-scale text search engine with custom web crawler, inverted index, and Elasticsearch — processing millions of documents
Implemented distributed PageRank using PySpark on a cluster; explored graph-based relevance algorithms that complement modern neural re-ranking
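
For readers unfamiliar with PageRank, the core iteration can be sketched on a single machine. This is a minimal illustration, not the PySpark implementation described above: the function name `pagerank`, the `damping` and `iterations` parameters, and the toy graph are all assumptions made for the example.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank. `graph` maps each node to the nodes it links to."""
    # Collect every node, including ones that only appear as link targets.
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)

    # Start from a uniform distribution.
    ranks = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by dangling nodes (no out-links) is spread evenly.
        dangling = sum(ranks[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {node: base for node in nodes}

        # Each node passes an equal share of its rank to every out-link.
        for node, targets in graph.items():
            if not targets:
                continue
            share = damping * ranks[node] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks

    return ranks


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
toy_ranks = pagerank(toy_graph)
```

A distributed version would typically express the same two steps (emit contributions per out-link, then sum per target) as a join followed by a reduce over the edge list.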

02. Education

Northeastern University

MS Computer Science

GPA: 4.0 • Boston, MA

Vellore Institute of Technology

BS Computer Science

GPA: 9.08/10 • Chennai, India

03. Projects

Multi-Agent Research Assistant

Python, LangChain, Claude API, RAG

Agentic AI system with planning, tool-use, and memory — agents decompose research goals into subtasks and self-correct on failure
RAG pipeline with semantic chunking, hybrid search (BM25 + embeddings), and citation grounding
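
One common way to combine BM25 and embedding scores in a hybrid search is a weighted sum of normalized scores. The sketch below assumes that approach; the project's actual fusion method is not stated here, and the names `min_max`, `hybrid_rank`, the `alpha` weight, and the sample scores are hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no document is preferred by this signal.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(bm25_scores, dense_scores, alpha=0.5):
    """Fuse lexical and embedding scores; alpha is the weight on BM25 (assumed)."""
    lexical = min_max(bm25_scores) if bm25_scores else {}
    dense = min_max(dense_scores) if dense_scores else {}

    # A document missing from one retriever contributes 0 for that signal.
    fused = {
        doc: alpha * lexical.get(doc, 0.0) + (1.0 - alpha) * dense.get(doc, 0.0)
        for doc in set(lexical) | set(dense)
    }
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores for three documents from each retriever.
bm25 = {"d1": 12.0, "d2": 3.0, "d3": 0.5}
dense = {"d1": 0.2, "d2": 0.9, "d3": 0.8}
ranking = hybrid_rank(bm25, dense)
```

Normalizing first matters because raw BM25 scores are unbounded while cosine similarities are not, so an unnormalized sum would let the lexical signal dominate.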

LLM Evaluation Harness

Python, AWS Bedrock, OpenAI API

Automated framework for evaluating LLM outputs across factuality, faithfulness, and instruction-following using LLM-as-judge
Multi-provider benchmarking across Claude, GPT-4, and Bedrock models with structured reporting

Distributed Key-Value Store

C++

Primary-backup state machine replication with fault tolerance and leader election
Multithreaded load balancer with in-memory cache — demonstrates systems fundamentals behind AI infrastructure

Conversational Search Engine

Python, Elasticsearch, PySpark, BERT

Combined traditional IR (PageRank, BM25) with neural re-ranking using BERT embeddings
Conversational query interface with multi-turn context retention over indexed document corpus

04. Research

Candidate Prioritization using Automated Behavioural Interview

International Conference • Deep Learning & NLP

Early work on AI-driven hiring — automated behavioural scoring using deep learning, a precursor to today's conversational AI evaluation systems

Face Recognition With Mask Using MTCNN And FaceNet

IRTAC 2020 • Springer

Real-world neural network adaptation — fine-tuned FaceNet embeddings for occluded face recognition under challenging conditions, demonstrating transfer learning and domain-specific model adaptation

05. What I'm Thinking About

Agentic AI Systems

How do you build agents that actually work in production? ReAct, tool-use, memory architectures, and why most agent demos fail at scale.

LLM Infrastructure on AWS

The real cost of running LLMs in production — latency vs. accuracy tradeoffs, prompt caching, Bedrock vs. self-hosted, and observability that actually catches hallucinations.

RAG at Scale

When naive RAG breaks — advanced retrieval (HyDE, multi-query, re-ranking), chunking strategies, and grounding LLMs in real-world enterprise data.

06. Contact