Major projects

AI Platform

Deliverable: Features for an in-house AI Platform that enables game titles such as FIFA, NFS, and Madden to experiment with, train, tune, and serve their ML models.

Contributions:

  • Designed and built feature store capabilities such as a batch materialization engine for ingesting data in batches from the offline store to the online store, plus multi-tenancy support with independent feature store servers for tenants such as DnA and Battlefield 2042 (see the sketch after this list).
  • Added a new feature to the AI Platform's JupyterHub service to support custom Docker environments built by data scientists from different game titles.
  • Extended a custom model training service so that data scientists can use the same custom Docker environments when training their ML models with it.
  • Worked on adding RStudio Server support to the same JupyterHub service, enabling data analysts from teams such as FIFA, NHL, and Madden to work on their existing R projects.
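
A minimal sketch of the batch materialization idea, purely for illustration: read a batch of features from an offline Parquet store and push the latest row per entity into a Redis online store. The file path, entity key, and Redis key scheme below are hypothetical and do not reflect the platform's actual API.

    # Illustrative only: hypothetical paths, entity key, and Redis key scheme.
    import pandas as pd
    import redis

    def materialize_batch(offline_path: str, entity_key: str, r: redis.Redis) -> int:
        """Copy the latest feature row per entity from the offline store (Parquet)
        into the online store (Redis)."""
        df = pd.read_parquet(offline_path)
        latest = df.sort_values("event_timestamp").groupby(entity_key).tail(1)
        pipe = r.pipeline()
        for _, row in latest.iterrows():
            key = f"features:{entity_key}:{row[entity_key]}"
            values = row.drop(labels=["event_timestamp"]).astype(str).to_dict()
            pipe.hset(key, mapping=values)
        pipe.execute()
        return len(latest)

    if __name__ == "__main__":
        client = redis.Redis(host="localhost", port=6379)
        print(materialize_batch("player_features.parquet", "player_id", client))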

Outcome: Eliminated the need for duplicate ETL pipelines and data storage, reducing development and operational costs by 50% with the feature store. Reduced training costs for the game titles by up to 23% by using the in-house AI Platform instead of AWS SageMaker.

Contextual Assistant with Smart Recommendations

Deliverable: An NLP-based contextual chatbot to handle the internal support requests of a US banking firm.

Contributions:

  • Prepared NLU and dialogue data for intent classification and dialogue models with entities, synonyms, story checkpoints, and other aspects of conversational data.
  • Trained and tuned the models to achieve over 95% accuracy on intent classification of user utterances.
  • Integrated the bot with Microsoft Teams, an internal communication channel used by the firm.
  • Developed backend Python services for ServiceNow and Salesforce integrations to create and manage different types of support issues (see the sketch after this list).
  • Contributed to the development of a model that recommends knowledge base (KB) articles based on user queries.
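
As an illustration of the ticket-creation flow, the hedged sketch below creates a ServiceNow incident through the REST Table API using the requests library. The instance URL, credentials, and field mapping are placeholders; the production service handled authentication and field mapping differently.

    # Placeholder instance URL and credentials; illustrative field mapping only.
    import requests

    SERVICENOW_INSTANCE = "https://example.service-now.com"

    def create_incident(short_description: str, description: str,
                        user: str, password: str) -> str:
        """Create an incident ticket and return its sys_id."""
        resp = requests.post(
            f"{SERVICENOW_INSTANCE}/api/now/table/incident",
            auth=(user, password),
            headers={"Accept": "application/json"},
            json={"short_description": short_description, "description": description},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()["result"]["sys_id"]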

Outcome: Automated the firm’s support issue creation and reduced the budgeted cost of manpower by 20%.

Multiple Conversational AI Assistants

Deliverables: Multiple smart NLP-based assistants to support mission-critical IT operations of a large US retail firm with customers across the globe.

Contributions:

  • Worked with the client team to build the chatbots' dialogue flows, acquiring knowledge from domain experts.
  • Developed a backend module for an application called Testing and Operations Manager (TOM) that evaluates and analyzes the overall performance of the chatbots using different metrics.
  • Improved the application's efficiency by loading data into a Redis in-memory cache once during server startup instead of repeatedly reading it from disk (see the sketch after this list).
  • Containerized all the chatbots using Docker and deployed them to Azure Kubernetes Service (AKS) through a CI/CD pipeline in Azure DevOps. Also trained, tuned, and monitored the bots' NLP pipelines.
  • Developed Python scripts to securely store and fetch the bots' credentials from Azure Key Vault.
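
A minimal sketch of the warm-once caching pattern used here, assuming a hypothetical reference-data file and key layout: data is loaded into Redis a single time at server startup, and request handlers read only from the cache.

    # Hypothetical file name and key layout; illustrative only.
    import json
    import redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    def warm_cache(path: str = "reference_data.json") -> None:
        """Run once during server startup."""
        if r.exists("cache:warmed"):
            return  # another worker already populated the cache
        with open(path) as f:
            data = json.load(f)
        pipe = r.pipeline()
        for key, value in data.items():
            pipe.set(f"cache:{key}", json.dumps(value))
        pipe.set("cache:warmed", "1")
        pipe.execute()

    def get_cached(key: str):
        """Request handlers read from Redis instead of the disk."""
        raw = r.get(f"cache:{key}")
        return json.loads(raw) if raw is not None else None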

Outcome: Reduced the deployment time and budgeted workforce cost by 70%.

Customer Email Category Classification

Deliverable: A multi-class predictive classifier to categorize the customer emails of a major US financial services corporation.

Contributions:

  • Performed exploratory data analysis to understand the features and assess whether the problem at hand could be solved with the given dataset.
  • Built a data pipeline to clean, transform, and derive additional features to better represent the data.
  • Developed and tuned an LSTM-based RNN using TensorFlow and Keras, achieving over 85% accuracy on the classification task (see the sketch after this list).
  • Worked collaboratively with the on-site US counterparts and presented the resulting insights to the on-site architect.
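
A minimal sketch of an LSTM classifier of this kind in TensorFlow/Keras, assuming tokenized and padded integer sequences as input; the vocabulary size, number of categories, and hyperparameters are illustrative rather than the tuned values.

    # Illustrative hyperparameters; assumes pre-tokenized, padded input sequences.
    from tensorflow.keras import layers, models

    VOCAB_SIZE = 20_000   # assumed vocabulary size
    NUM_CLASSES = 8       # assumed number of email categories

    model = models.Sequential([
        layers.Embedding(VOCAB_SIZE, 128),
        layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # X_train: (n_samples, max_len) integer token ids; y_train: integer labels.
    # model.fit(X_train, y_train, validation_split=0.1, epochs=10, batch_size=64)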

Outcome: Reduced the customer support team's turnaround time by 20% during the pandemic, when online transaction volumes increased drastically for a corporation holding roughly a 25% market share.

StrapvizPy and StrapvizR

GitHub: Python | R

Deliverables: Simple Python and R packages to bootstrap, visualize, and tabulate statistics.

Contributions:

  • Developed packages to streamline the process of bootstrapping samples and creating insightful plots and tables with statistics such as confidence intervals and standard errors (see the sketch after this list).
  • Added support for exporting tables as LaTeX (.tex) files for easy incorporation into reports and papers.
  • Published the Python package to PyPI and the user documentation to readthedocs.org.
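
The core computation the packages automate looks roughly like the sketch below; the function name and signature are illustrative and do not mirror the published StrapvizPy API.

    # Illustrative only; not the actual StrapvizPy function signature.
    import numpy as np

    def bootstrap_ci(sample, estimator=np.mean, n_boot=1000, level=0.95, seed=0):
        """Return (lower, upper, standard_error) for the bootstrap sampling
        distribution of `estimator` applied to `sample`."""
        rng = np.random.default_rng(seed)
        sample = np.asarray(sample)
        stats = np.array([
            estimator(rng.choice(sample, size=sample.size, replace=True))
            for _ in range(n_boot)
        ])
        alpha = (1 - level) / 2
        lower, upper = np.quantile(stats, [alpha, 1 - alpha])
        return lower, upper, stats.std(ddof=1)

    # Example: 95% confidence interval for the mean of a small sample.
    low, high, se = bootstrap_ci([4.2, 5.1, 3.8, 6.0, 5.5], n_boot=2000)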

Outcome: These packages assist anyone who wants to bootstrap and visualize sampling distributions with different estimators during statistical analysis; they handle the tedious bootstrapping and visualization steps so the user can focus on the actual analysis.

Forest Fire Area Prediction

GitHub

Deliverable: A regression model to predict the size of a forest fire based on meteorological and soil moisture records.

Contributions:

  • Developed a Python script to pre-process the data and remove outliers using Cook's distance.
  • Implemented a Python script to train, tune, and cross-validate a support vector regression (SVR) model (see the sketch after this list).
  • Built a script to evaluate the tuned model on unseen data and generate score tables.
  • Tested the Python scripts in the data analysis pipeline using the pytest framework.
  • Established git workflow practices and task management using Trello.
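
A condensed sketch of the outlier-removal and tuning steps, assuming numeric features and the common 4/n cutoff for Cook's distance; the file and column names and the parameter grid are illustrative.

    # Illustrative file/column names, threshold, and parameter grid.
    import pandas as pd
    import statsmodels.api as sm
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    df = pd.read_csv("forestfires.csv")                       # assumed file name
    y = df["area"]
    X = df.drop(columns=["area"]).select_dtypes("number")     # numeric features only

    # Flag influential observations via Cook's distance from an OLS fit.
    ols = sm.OLS(y, sm.add_constant(X)).fit()
    cooks_d = ols.get_influence().cooks_distance[0]
    mask = cooks_d < 4 / len(X)                               # rule-of-thumb cutoff
    X_clean, y_clean = X[mask], y[mask]

    # Cross-validated hyperparameter search for a support vector regressor.
    pipe = make_pipeline(StandardScaler(), SVR())
    grid = GridSearchCV(pipe,
                        param_grid={"svr__C": [0.1, 1, 10],
                                    "svr__gamma": ["scale", 0.01, 0.1]},
                        scoring="neg_mean_absolute_error",
                        cv=5)
    grid.fit(X_clean, y_clean)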

Outcome: Reduced the error on unseen data to 8.68 ha, where the target in the original data ranged from 0 to 1090.84 ha.

Earthosys - Tsunami Alert System

GitHub

Deliverables: A predictive model that classifies whether an earthquake can cause a tsunami, using seismographic data from NASA and NOAA, and an alert system to warn the nearest coast.

Contributions:

  • Worked on data cleaning, pre-processing, and merging data from different sources.
  • Developed a search algorithm to efficiently find the latitude and longitude of the coastal point nearest to the earthquake's source point in the sea (see the sketch after this list).
  • Built a binary classification model using Random Forests to classify earthquakes as tsunamigenic or non-tsunamigenic.
  • Developed a website and a chatbot to present prediction results based on live earthquake data and enable seamless user interaction.
  • Developed a Raspberry Pi-based mini alert device that can be controlled from an Android phone like a remote-controlled car.
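
A minimal sketch of the nearest-coastal-point lookup using the haversine distance; the sample coastline coordinates below are placeholders for the real coastline dataset.

    # Placeholder coastline coordinates; the project used a full coastline dataset.
    import numpy as np

    EARTH_RADIUS_KM = 6371.0

    def nearest_coastal_point(quake_lat, quake_lon, coast_latlon):
        """Return (lat, lon, distance_km) of the coastal point closest to the
        earthquake epicentre; `coast_latlon` is an (N, 2) array in degrees."""
        lat1, lon1 = np.radians([quake_lat, quake_lon])
        lat2 = np.radians(coast_latlon[:, 0])
        lon2 = np.radians(coast_latlon[:, 1])
        dlat, dlon = lat2 - lat1, lon2 - lon1
        a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
        dist = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
        i = int(np.argmin(dist))
        return coast_latlon[i, 0], coast_latlon[i, 1], float(dist[i])

    coast = np.array([[13.08, 80.27], [8.08, 77.55], [17.68, 83.21]])  # sample points
    print(nearest_coastal_point(3.3, 95.9, coast))  # approx. 2004 Sumatra epicentre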

Outcome: Improved upon the existing fuzzy logic-based warning system with modern technologies, achieving a classification accuracy of 97%.