Prasanna Sattigeri

Principal Research Scientist at IBM Research AI and MIT-IBM Watson AI Lab, focusing on reliable AI, LLM governance, uncertainty quantification, and trustworthy machine learning.

Curriculum Vitae


Professional Experience

Principal Research Scientist

IBM Research AI, MIT-IBM Watson AI Lab | December 2015 - Present

Key Roles:

Responsibilities:

Selected Projects:


Software Engineer

Yelp Inc., San Francisco, CA | December 2014 - December 2015


Research Assistant

School of ECEE, Arizona State University | January 2009 - December 2014


Teaching Assistant

School of ECEE, Arizona State University | August 2011 - December 2013


Education

Ph.D. in Electrical Engineering | December 2014

Bachelor of Technology in Electronics Engineering | April 2008


Honors and Awards

Year | Award
2025 | Granite Guardian #1 on GuardBench (86% accuracy across 40 datasets)
2025 | Granite Guardian #1 on REVEAL benchmark (outperforms GPT-4o)
2025 | Granite Guardian #3 on LLM-AggreFact benchmark
2021 | IBM Research Technical Achievement Award - Science of Uncertainty Quantification
2021 | IBM Research Technical Achievement Award - Science of Accurate, Robust, and Generalizable AI
2020 | IBM Research Technical Achievement Award - Science in Learning with Less Labels (LwLL)
2020 | IBM Research Technical Achievement Award - Dynamic Neural Networks for Efficient AI
2020 | Harvard Belfer Center Tech Spotlight Runner-up for AI Fairness 360
2019 | IBM Outstanding Technical Achievement Award - Contributions to Trustworthy AI
2019 | Best Paper at KDD Applied Data Science for Healthcare Workshop
2014 | University Graduate Fellowship, Arizona State University

Professional Service

Leadership Roles:

Workshop Organization:

Review Service:


Invited Talks, Tutorials and Panels

Year | Event | Topic
2024 | MIT AI Conference | AI Ethics and Change Management
2024 | NAACL TrustNLP Workshop | LLM Governance and Alignment
2024 | National Academy of Sciences | Reliable AI-assisted Decision Making
2024 | CHI TREW Workshop | Trust and Reliance in Evolving Human-AI Workflows
2023 | KDD Workshop | Uncertainty Calibration and AI-assisted Decision Making
2023 | DSHealth Workshop, KDD | Generative AI and Safety
2023 | AI for Open Society Day, KDD | Trustworthy LLMs (Panel)
2021 | KDD Responsible AI Workshop | Trustworthy AI Toolkits
2021 | PyData Global | AI Uncertainty Quantification Tutorial
2021 | ACM CODS-COMAD | UQ360 Hands-on Tutorial
2021 | ARC Training Centre, Australia | AI Uncertainty Quantification
2019 | AI Research Week, MIT | Trusted AI (Talk and Panel)
2019 | Harvard ComputeFest | AI Fairness (Talk and Tutorial)
2018 | AAAI Fall Symposium | AI and Trust (Talk and Panel)

Selected Press Coverage


Selected Publications

2025

  1. Paes, L.M., Wei, D., Do, H.J., Strobelt, H., Luss, R., Dhurandhar, A., Nagireddy, M., Ramamurthy, K.N., Sattigeri, P., Geyer, W., Ghosh, S. “Multi-Level Explanations for Generative Language Models.” ACL, 2025.
  2. Miehling, E., Desmond, M., Ramamurthy, K.N., Daly, E.M., Varshney, K.R., et al., Sattigeri, P. “Evaluating the Prompt Steerability of Large Language Models.” NAACL, 2025.
  3. Padhi, I., Nagireddy, M., Cornacchia, G., et al., Sattigeri, P. “Granite Guardian: Comprehensive LLM Safeguarding.” NAACL Industry Track, 2025.
  4. Huang, Y., Hua, H., Zhou, Y., et al., Sattigeri, P., Zhang, X. “Building a Foundational Guardrail for General Agentic Systems via Synthetic Data.” arXiv:2510.09781, 2025.
  5. 66 co-authors incl. Sattigeri, P. “On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective.” arXiv:2502.14296, 2025.
  6. Richards, J.T., Dhurandhar, A., Daly, E.M., Hind, M., Sattigeri, P., Wei, D., et al. “Agentic AI Needs a Systems Theory.” arXiv:2503.00237, 2025.

2024

  1. Shen, M., Ryu, J.J., Ghosh, S., Bu, Y., Sattigeri, P., Das, S., Wornell, G.W. “Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?” NeurIPS, 2024.
  2. Hou, Y., Pascale, A., Carnerero-Cano, J., Tchrakian, T., Marinescu, R., Daly, E., Padhi, I., Sattigeri, P. “WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia.” NeurIPS Datasets and Benchmarks, 2024.
  3. Rawat, A., Schoepf, S., Zizzo, G., Cornacchia, G., et al., Sattigeri, P., Chen, P.Y., Varshney, K.R. “Attack Atlas: A Practitioner’s Perspective on Challenges and Pitfalls in Red Teaming GenAI.” NeurIPS, 2024.
  4. Shen, M., Das, S., Greenewald, K.H., Sattigeri, P., Wornell, G.W., Ghosh, S. “Thermometer: Towards Universal Calibration for Large Language Models.” ICML, 2024.
  5. Miehling, E., Nagireddy, M., Sattigeri, P., Daly, E.M., Piorkowski, D., Richards, J.T. “Language Models in Dialogue: Conversational Maxims for Human-AI Interactions.” EMNLP Findings, 2024.
  6. Padhi, I., Ramamurthy, K.N., Sattigeri, P., Nagireddy, M., Dognin, P., Varshney, K.R. “Value Alignment from Unstructured Text.” EMNLP Industry Track, 2024.
  7. Pedapati, T., Dhurandhar, A., Ghosh, S., Dan, S., Sattigeri, P. “Large Language Model Confidence Estimation via Black-Box Access.” arXiv:2406.04370, 2024.
  8. Paes, L.M., Wei, D., Do, H.J., Strobelt, H., Luss, R., Dhurandhar, A., Nagireddy, M., Ramamurthy, K.N., Sattigeri, P., Geyer, W., Ghosh, S. “Multi-Level Explanations for Generative Language Models.” arXiv:2403.14459, 2024.
  9. Nagireddy, M., Padhi, I., Ghosh, S., Sattigeri, P. “When in Doubt, Cascade: Towards Building Efficient and Capable Guardrails.” arXiv:2407.06323, 2024.
  10. Jiang, M., Ruan, Y., Sattigeri, P., Roukos, S., Hashimoto, T. “Graph-based Uncertainty Metrics for Long-form Language Model Outputs.” arXiv:2410.20783, 2024.
  11. Achintalwar, S., Baldini, I., Bouneffouf, D., et al., Sattigeri, P., et al., Varshney, K.R. “Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations.” arXiv:2403.09704, 2024.

2023

  1. Basu, S., Katdare, P., Sattigeri, P., Chenthamarakshan, V., Driggs-Campbell, K., Das, P., Varshney, L.R. “Efficient Equivariant Transfer Learning from Pretrained Models.” NeurIPS, 2023.
  2. Mozannar, H., Lee, J.J., Wei, D., Sattigeri, P., Das, S., Sontag, D. “Effective Human-AI Teams via Learned Natural Language Rules and Onboarding.” NeurIPS, 2023.
  3. Shen, M., Bu, Y., Sattigeri, P., Ghosh, S., Das, S., Wornell, G.W. “Post-hoc Uncertainty Learning Using a Dirichlet Meta-Model.” AAAI, 2023.
  4. Shen, M., Ghosh, S., Sattigeri, P., Das, S., Bu, Y., Wornell, G.W. “Reliable Gradient-free and Likelihood-free Prompt Tuning.” EACL Findings, 2023.
  5. Shah, A., Shen, M., Ryu, J.J., Das, S., Sattigeri, P., Bu, Y., Wornell, G.W. “Group Fairness with Uncertainty in Sensitive Attributes.” arXiv:2302.08077, 2023.

2022

  1. Shah, A., Bu, Y., Lee, J., Sattigeri, P., et al. “Selective Regression Under Fairness Criteria.” ICML, 2022.
  2. Lee, J., Bu, Y., Sattigeri, P., et al. “A maximal correlation approach to imposing fairness in machine learning.” ICASSP, 2022.
  3. Varici, B., Shanmugam, K., Sattigeri, P., Tajer, A. “Intervention Target Estimation in the Presence of Latent Variables.” UAI, 2022.
  4. Lee, J., Bu, Y., Sattigeri, P., et al. “A Maximal Correlation Framework for Fair Machine Learning.” Entropy 24(4): 461, 2022.
  5. Ghosh, S., Liao, Q.V., Ramamurthy, K.N., Navratil, J., Sattigeri, P., Varshney, K., Zhang, Y. “Uncertainty Quantification 360: A Hands-on Tutorial.” ACM CODS-COMAD, 2022.

2021

  1. Varici, B., Shanmugam, K., Sattigeri, P., Tajer, A. “Scalable Intervention Target Estimation in Linear Models.” NeurIPS 34: 1494-1505, 2021.
  2. Ahuja, K., Sattigeri, P., et al. “Conditionally independent data generation.” UAI, pp. 2050-2060, 2021.
  3. Luss, R., Chen, P.Y., Dhurandhar, A., Sattigeri, P., et al. “Leveraging latent features for local explanations.” KDD, pp. 1139-1149, 2021.
  4. Bhatt, U., et al. “Uncertainty as a form of transparency: Measuring, communicating, and using uncertainty.” AAAI/ACM AIES, pp. 401-413, 2021.
  5. Lee, J.K., Bu, Y., Rajan, D., Sattigeri, P., et al. “Fair Selective Classification via Sufficiency.” ICML, pp. 6076-6086, 2021.
  6. Meng, Y., Panda, R., Lin, C.C., Sattigeri, P., et al. “AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition.” ICLR, 2021.
  7. Galhotra, S., Shanmugam, K., Sattigeri, P., Varshney, K.R. “Interventional Fairness with Indirect Knowledge of Unobserved Protected Attributes.” Entropy 23(12): 1571, 2021.

2020

  1. Kinyanjui, N.M., et al. “Fairness of classifiers across skin tones in dermatology.” MICCAI, pp. 320-329, 2020.
  2. Tatro, N., Chen, P.Y., Das, P., Melnyk, I., Sattigeri, P., Lai, R. “Optimizing mode connectivity via neuron alignment.” NeurIPS 33: 15300-15311, 2020.
  3. Thiagarajan, J.J., Venkatesh, B., Sattigeri, P., Bremer, P.T. “Building calibrated deep models via uncertainty matching with auxiliary interval predictors.” AAAI, 2020.
  4. Meng, Y., Lin, C.C., Panda, R., Sattigeri, P., et al. “AR-Net: Adaptive Frame Resolution for Efficient Action Recognition.” ECCV, pp. 86-104, 2020.

2019

  1. Lee, J., Sattigeri, P., Wornell, G. “Learning New Tricks From Old Dogs: Multi-Source Transfer Learning From Pre-Trained Networks.” NeurIPS, 2019.
  2. Thiagarajan, J.J., Rajan, D., Sattigeri, P. “Understanding Behavior of Clinical Models under Domain Shifts.” KDD Healthcare Workshop, 2019. (Best Paper)
  3. Bellamy, R.K.E., et al. “AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias.” IBM Journal of Research and Development, 2019.
  4. Sattigeri, P., Hoffman, S.C., Chenthamarakshan, V., Varshney, K.R. “Fairness GAN: Generating Datasets with Fairness Properties.” IBM Journal of Research and Development, 2019.

2018

  1. Kumar, A., Sattigeri, P., et al. “Co-regularized alignment for unsupervised domain adaptation.” NeurIPS, pp. 9345-9356, 2018.
  2. Kumar, A., Sattigeri, P., Balakrishnan, A. “Variational inference of disentangled latent concepts from unlabeled observations.” ICLR, 2018.

2017

  1. Sattigeri, P., Kumar, A., Fletcher, T. “Semi-supervised learning with GANs: manifold invariance with improved inference.” NeurIPS, pp. 5534-5544, 2017.

Patents

  1. Luss, R., Chen, P.Y., Dhurandhar, A., Sattigeri, P., Shanmugam, K. “Contrastive explanations for images with monotonic attribute functions.” U.S. Patent 11,222,242 (January 2022)
  2. Ramamurthy, K., Thiagarajan, J., Sattigeri, P., Spanias, A. “Ensemble sparse models for image analysis and restoration.” U.S. Patent 9,875,428 (January 2018)
  3. Chen, P.Y., Das, P., Melnyk, I., Sattigeri, P., Lai, R., Tatro, N. “Efficient search of robust accurate neural networks.” U.S. Patent Application 16/926,407 (January 2022)
  4. Lee, J.K., Sattigeri, P., Wornell, G. “Multi-source transfer learning from pre-trained networks.” U.S. Patent Application 16/843,173 (October 2021)

Professional Membership

IEEE Senior Member