|
Soumya Rani Samineni
I am a second-year Ph.D. student at Arizona State University with 3+ years of professional experience and over 2 years of research experience in
Artificial Intelligence (AI), Reinforcement Learning (RL), Large Language Models (LLMs), Machine Learning (ML),
optimization, and robotics.
I worked as an ML Research Engineer at Quantiphi, Bangalore, where I applied reinforcement learning to workforce optimization.
I was a Research Fellow at Microsoft Research India, focusing on reinforcement learning algorithms for energy grids. I also contributed as an
AI Engineer at AI Labs, Hyderabad, developing a quadrupedal controller inspired by MIT Cheetah’s impedance control and building object-detection models.
I completed my Master’s in Computer Science and Engineering at the
Department of Computer Science and Automation, IISc Bangalore, advised by
Prof. Shishir Kolathaya and
Prof. Shalabh Bhatnagar.
My master’s thesis, Policy Search using Dynamic Mirror Descent for Off-policy Reinforcement Learning, was funded by the
Robert Bosch Center for Cyber-Physical Systems (RBCCPS).
During my time in the Stochastic Systems Lab and the
Stochastic Robotics Lab, I studied reinforcement learning for robotics and stochastic approximation.
Before IISc, I served as an Assistant Executive Engineer (Civil) for the Government of Telangana. I hold a Bachelor’s degree in Civil Engineering from the
National Institute of Technology, Warangal (NITW).
Email / Resume / Google Scholar / Twitter / GitHub
|
Research
I’m interested in Artificial Intelligence, with a focus on reinforcement learning, large language models, machine learning, optimization, and deep learning. My research covers reinforcement learning–based post-training methods for large language models and the reasoning and planning limitations of those models.
|
Publications
|
1. RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs
Soumya Rani Samineni, Durgesh Kalwar, Karthik Valmeekam, Kaya Stechly, Subbarao Kambhampati
NeurIPS 2025, LAW: Bridging Language, Agent, and World Models for Reasoning and Planning Workshop, 2025
|
2. Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains
Soumya Rani Samineni, Durgesh Kalwar, Vivek Gangal, Siddhant Bhambri, Subbarao Kambhampati
NeurIPS 2025, 5th Workshop on Mathematical Reasoning and AI, 2025
|
3. Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!
Subbarao Kambhampati, Kaya Stechly, Karthik Valmeekam, Levi Saldyt, Siddhant Bhambri, Vatsal Palod, Anuj Gundawar, Soumya Rani Samineni, Durgesh Kalwar, Uttaran Biswas
NeurIPS 2025, Workshop on CogInterp: Interpreting Cognition in Deep Learning Models, 2025
|
4. System and Method for Intelligent Scheduling of Manufacturing Jobs
Dagnachew Birru, Anirudh Deodhar, Achint Chaudhary, Soumya Rani Samineni
US Patent Application US20240319718A1, 2024
|
5. Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning
Soumya R Samineni*, Utkarsh Mishra*, P Goel, C Kunjeti, H Lodha, A Singh, A Sagi, Shalabh Bhatnagar, Shishir Kolathaya
(*equal contribution)
International Conference on Robotics and Automation (ICRA), 2022
NeurIPS Deep RL Workshop, 2021 (Poster)
NeurIPS Offline RL Workshop, 2021 (Poster)
project page / arXiv / video
|
6. Policy Search using Dynamic Mirror Descent MPC for Model Free Off Policy RL
Soumya R Samineni
Master’s Thesis, 2021
arXiv / video / code
|