Danial Mohseni Taheri

Senior Machine Learning Engineer at Walmart Labs.

Danial is currently a senior machine learning engineer at Walmart Labs. He is a member of the Core Search Algo team, where he develops and deploys advanced NLP models and LLMs to improve the quality of search results on Walmart.com. Previously, he was a senior machine learning engineer at J.P. Morgan working on advanced NLP algorithms. His efforts were focused on developing AI applications to extract knowledge and information from large-scale financial text data.

Before joining J.P. Morgan, he was a PhD candidate at the University of Illinois at Chicago. His research focused on developing (deep) machine learning algorithms for problems including recommendation systems, user behavior modeling, and data-driven decision-making. He was the recipient of multiple international awards, including the best paper award at CEMA 2020-21 and INFORMS 2020. During his PhD, he focused on:

Developing (deep) machine learning algorithms for modeling users in applications such as recommendation systems and service system management. He used tools from natural language processing (e.g., Transformers), graph convolutional networks, and Bayesian networks to model time-series user data.

Developing reinforcement learning (RL) algorithms for modeling real systems.

PUBLICATIONS

Template-aware Attention Model for Earnings Call Report Generation, NewSum EMNLP Workshop 2021.
KATRec: Knowledge-Aware aTtentive Sequential Recommendation System (with M. Amjadi, T. Tulabandhula)
Interpretable User Models via Decision-rule Gaussian Processes, NeurIPS 2019 (AABI symposium) (with S. Nadarajah, T. Tulabandhula)

Models of user behavior are critical inputs in many prescriptive settings and can be viewed as decision rules that transform state information available to the user into actions. Gaussian processes (GPs), as well as nonlinear extensions thereof, provide a flexible framework to learn user models in conjunction with approximate Bayesian inference. However, the resulting models may not be interpretable in general. We propose decision-rule GPs (DRGPs) that apply GPs in a transformed space defined by decision rules that have immediate interpretability to practitioners. We illustrate this modeling tool on a real application and show that structural variational inference techniques can be used with DRGPs. We find that DRGPs outperform the direct use of GPs in terms of both out-of-sample performance and the quality of optimized decisions. These performance advantages continue to hold when DRGPs are combined with transfer learning.
Meeting Corporate Renewable Power Targets, Under review in Management Science (with S. Nadarajah, A. Trivella) [PDF]

In this paper, we design procurement portfolios for companies that have committed to procuring a percentage of their future power demand from renewable sources. Computing multi-stage procurement portfolios for these companies results in intractable Markov decision processes. Using approximate dynamic programming, we develop a novel, near-optimal heuristic for decision making on realistic instances.
Investment under Limited Long-Term Information, Working Paper (with S. Nadarajah, A. Kelevin, S-E. Feleten)

The limited availability of long-term information about the uncertainties in decision-making problems (e.g., investment in hydropower plant capacity) increases the potential impact of model misspecification and often leads to suboptimal decisions. To overcome this issue, we propose an algorithm that leverages statistical information in the medium term and handles the lack of long-term information using robust optimization.
Overbooking in Network of Storage Assets, Technical report (with S. Nadarajah, T. Tulabandhula)

EXPERIENCE

Senior machine learning engineer at Walmart Labs

Develop a graph attention neural network algorithm for large-scale product ranking. Design and build the pipeline for inference in production.
Fine-tune a large language model (LLaMA2) using LoRA to classify the ratings of query-item pairs. Apply prompt engineering to a large language model (GPT) to generate queries for items without neighbors, improving the performance of the GNN.
Develop and train state-of-the-art information retrieval models such as Bi-encoder, ColBERT, and Cross-encoder.
Deploy CLIP (by OpenAI) to generate a feature based on textual and visual information in the ranking system.

Senior machine learning engineer at J.P. Morgan

Developed a deep learning-based summarization model using Transformers and a BERT Siamese network. Improved the ROUGE score on the earnings calls dataset by 17%.
Operationalized the summarization algorithm using AWS SageMaker.
Developed a novel algorithm for the semantic parsing task (NL2SQL) using BERT and LSTM, with significant improvement over the baseline model.

Graduate Research Assistant in Artificial Intelligence

Developed a deep learning-based recommender system using a knowledge graph and NLP. Improved the performance in NDCG and Recall by 5%. Deployed in TensorFlow.
Designed and implemented interpretable user models to predict users’ behavior using inference and transfer learning; tested the algorithm on corporate data; increased the accuracy of overbooking decisions by 10%.
Developed a recommendation engine for next-item prediction using Transformers and informative entities in TensorFlow.
Developed and implemented in C++ near-optimal decision-making algorithms to decrease the cost of constructing a portfolio of a financial commodity by 4%.
Proposed a robust reinforcement learning algorithm for decision making under limited long-term information of uncertainties.

Graduate Teaching Assistant in Data Science

Machine learning: Deep learning, Bayesian inference, reinforcement learning
Foundation of optimization: Linear and integer programming
Programming: Provided tutorials in PyTorch
Projects and leadership: Mentored graduate students in deep learning projects such as object detection and image classification (including CNN and RNN models).

Industry Experience (Research and Data Scientist)

Cloudbakers: Created a data pipeline for GitHub data in BigQuery and developed a clustering method to evaluate the risk of repositories for software development purposes, Fall 2019
Norsk Hydro: Collaborated on an R&D project with a hydropower plant company to build an artificial intelligence algorithm for managing long-term investment decisions, Spring 2020

SKILLS AND BACKGROUND

Interests

Machine Learning
Deep Learning
Reinforcement Learning
Applied statistics

Programming

Python (pandas, PyTorch, TensorFlow, Keras, scikit-learn, SciPy)
C++
R, MATLAB
SQL, Spark
Unix, AWS

Education

PhD in Information Sciences (Machine learning), University of Illinois at Chicago (2021)
BSc in Engineering, Amirkabir University (2015)

INVITED TALKS

GNN4Rank: A Graph Neural Network System for Large-scale Product Ranking, Walmart Global Tech Conference, Bentonville	Jul 2023
KATRec: Knowledge-Aware aTtentive Sequential Recommendation System, 24th International Conference on Discovery Science	Aug 2021
Interpretable User Models via Decision-rule Gaussian Processes, NeurIPS, Vancouver, Canada	Dec 2019
Meeting Corporate Renewable Power Targets, INFORMS Annual Meeting, Seattle, Washington	Oct 2019
Overbooking in Network of Storage Assets, Production and Operations Management Society Annual Conference, Washington D.C. (POMS)	May 2019
Meeting Corporate Renewable Power Targets, Production and Operations Management Society Annual Conference (POMS)	May 2019
Investment under Limited Long-Term Information, Production and Operations Management Society Annual Conference (POMS)	May 2019
Dual Reoptimization based Approximate Dynamic Programming, INFORMS Annual Meeting, Phoenix, Arizona	Nov 2019
Meeting Corporate Renewable Power Targets, Production and Operations Management Society Annual Conference, Houston, Texas (POMS)	May 2019
Meeting Corporate Renewable Power Targets, Manufacturing & Service Operations Management Annual Conference, Dallas, Texas (MSOM)	Jul 2018

RECENT NEWS: