• Ads Personalization Recommendation System

    September 25, 2023

    Engineered a robust ETL architecture employing Spark, Hadoop, and AWS Glue, enhancing data ingestion and storage processes

    Crafted a hybrid model combining content-based and collaborative filtering, using Spark ML, AWS SageMaker, and Market Basket Analysis techniques, to deliver tailored ads and improve conversion rates

    Revamped ad targeting using Redshift analytics and set up monitoring for improved ad personalization and system efficiency
  • Retrieval Augmented Generation for Improved Medical QA

    September 10, 2023

    Created a question-answering agent for a medical knowledge base, using a RAG approach to mitigate hallucinations in LLMs

    Established two SageMaker endpoints to optimize vector search: one employing dense embeddings from all-MiniLM-L6-v2 and the other using sparse embeddings via SPLADE, both integrated with Pinecone

    Deployed a Llama2 endpoint on SageMaker for improved text generation, using search-driven prompts, outperforming GPT-4
  • Enterprise Conversational AI Assistant

    August 30, 2023

    Developed an enterprise chatbot, utilizing Airflow and HuggingFace with Pinecone for seamless data management across domains

    Integrated Rasa with LangChain for intent classification, API interactivity, and advanced reasoning and summarization capabilities

    Optimized model performance with LLMCache in Redis, implemented GuardRails, and monitored via PromptLayer and LangSmith
  • Bayesian-Driven Customer Segmentation with Causal Analysis

    August 24, 2023

    Applied K-means and DBSCAN to the Instacart dataset to segment customers for personalized marketing strategies

    Used Propensity Score Matching and post-campaign t-tests, highlighting statistically significant control-treatment differences

    Employed Bayesian modeling with MultinomialNB for customer behavior prediction, achieving an accuracy of 93.44%
  • Social Network Information Dynamics

    August 10, 2023

    Developed a social network simulation with Neo4j, Apache Spark, and Apache Cassandra, and used Apache Airflow via GCP's Cloud Composer to automate workflows for A/B testing of two content dissemination strategies: influencer-driven and randomized

    Integrated Kafka for data streaming, orchestrated with Kubernetes, and provided insights into the efficacy of both strategies