Introduces prediction-based Markov Violation Scores (MVS) to detect when observations in RL violate the Markov property. The method leverages prediction errors from learned dynamics models to quantify non-Markovian behavior, enabling more robust policy learning under partial observability.
🎉 Accepted at a Reinforcement Learning conference and will appear in Reinforcement Learning Journal 2026.
arXiv Paper
Proposes Temporal Functional Circuits (TFCs), a framework for extracting faithful, interpretable explanations from Kolmogorov-Arnold Networks (KANs) applied to time-series forecasting. By analyzing learned spline activations, TFCs reveal how individual input features are transformed and combined, offering transparent insight into KAN predictions.
Status: Under review for NeurIPS 2026
arXiv Paper
Scan the QR code above to try our live nutrition estimation service! Text a meal description like "I had a bagel for breakfast" and get instant nutrition analysis. The underlying LLM is a Llama3.1B model trained on the NutriBench dataset and fine-tuned with reinforcement learning. The inference model is hosted on AWS for real-time responses.
GitHub Repository
Sept 2021
The N-armed bandit is a classical problem in computer science. In this Jupyter notebook we empirically verify that a near-greedy approach converges to the optimal values faster than non-greedy or greedy approaches and maximizes the expected reward. Jupyter Notebook
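As a rough illustration of the comparison described above, here is a minimal sketch that is not taken from the notebook. It assumes the near-greedy strategy is epsilon-greedy, that rewards are Gaussian with unit variance, and that value estimates are sample averages; the function name `run_bandit` and the arm means are made up for the example.

```python
import random


def run_bandit(true_means, epsilon, steps=2000, seed=0):
    """Simulate one epsilon-greedy run and return the average reward per step.

    epsilon=0.0 is purely greedy, epsilon=1.0 is purely random (non-greedy),
    and a small epsilon such as 0.1 is the near-greedy case.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each arm's mean reward
    counts = [0] * n_arms       # number of times each arm has been pulled
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)  # assumed unit-variance reward
        counts[arm] += 1
        # incremental sample-average update of the chosen arm's estimate
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / steps


if __name__ == "__main__":
    means = [0.2, 0.5, 1.0, 0.1]  # hypothetical arm means
    for eps in (0.0, 0.1, 1.0):
        avg = sum(run_bandit(means, eps, seed=s) for s in range(50)) / 50
        print(f"epsilon={eps}: average reward {avg:.3f}")
```

Averaging over several seeds matters here: a single greedy run can lock onto a poor arm early or get lucky, so the difference between strategies only shows up reliably in the mean.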
Jun 2021
Causal Structure Discovery is the problem of identifying causal relationships from large quantities of data through computational methods. A solution to this problem could have a wide range of applications in non-empirical scientific studies such as climate, biodiversity, and health. The current difficulty is that existing methods do not scale computationally and are data-intensive. Jupyter Notebook
Apr 2017
Q-learning-based reinforcement learning agent trained to play Flappy Bird. demo
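For readers unfamiliar with Q-learning, the core update can be sketched as below. This is the generic tabular form, not the project's code: the project may use a different state representation or a function approximator, and the state and action names in the usage lines are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.99, done=False):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    q maps (state, action) pairs to value estimates; unseen pairs default to 0.
    """
    # Value of the best action in the next state (zero if the episode ended).
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)
    actions = ["flap", "noop"]  # hypothetical action names
    # Hypothetical discretized states; reward for surviving one step.
    q_update(q, "s0", "flap", 1.0, "s1", actions, alpha=0.5, gamma=0.9)
    print(q[("s0", "flap")])
```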
May 2018
Object detection on a Raspberry Pi. demo
2020
Generative adversarial network with a variational autoencoder. details
Jul 2020
Image generation from a latent vector sampled from Gaussian noise. details