- Ads Personalization Recommendation System
Engineered a robust ETL architecture with Spark, Hadoop, and AWS Glue to improve data ingestion and storage processes
Crafted a hybrid model combining content-based and collaborative filtering, using Spark ML, AWS SageMaker, and Market Basket Analysis techniques to deliver tailored ads and improve conversion rates
Revamped ad targeting using Redshift analytics and set up monitoring for improved ad personalization and system efficiency
- Retrieval Augmented Generation for Improved Medical QA
Created a question-answering agent for a medical knowledge base, using a RAG approach to mitigate hallucinations in LLMs
Established two SageMaker endpoints to optimize vector search: one using dense embeddings from all-MiniLM-L6-v2 and the other using sparse embeddings from SPLADE, both integrated with Pinecone
Deployed a Llama2 endpoint on SageMaker for improved text generation, using search-driven prompts, outperforming GPT-4
- Enterprise Conversational AI Assistant
Developed an enterprise chatbot using Airflow, Hugging Face, and Pinecone for data management across domains
Integrated Rasa with LangChain for intent classification, API interactivity, and advanced reasoning and summarization capabilities
Optimized model performance with LLMCache in Redis, implemented GuardRails, and monitored the system via PromptLayer and LangSmith
- Bayesian-Driven Customer Segmentation with Causal Analysis
Applied K-means and DBSCAN to the Instacart dataset to segment customers for personalized marketing strategies
Used Propensity Score Matching and post-campaign t-tests, which showed statistically significant differences between control and treatment groups
Employed Bayesian modeling with MultinomialNB for customer behavior prediction, achieving 93.44% accuracy
- Social Network Information Dynamics
Developed a social network simulation with Neo4j, Apache Spark, and Apache Cassandra, and used Apache Airflow via GCP's Cloud Composer to automate workflows for A/B testing of two content dissemination strategies: influencer-driven and randomized
Integrated Kafka for data streaming, orchestrated with Kubernetes, to provide insights into the efficacy of both strategies