Welcome
👋 Hi, I'm Nayan. I'm a Data Scientist & Applied ML Engineer.

Transforming raw data into actionable insights by designing scalable ML pipelines and deep learning models for real-world system optimization.

Resume Preview

Technical Skills

Languages & Database:
Python, SQL, PostgreSQL
Machine Learning & AI:
Scikit-learn, XGBoost, LightGBM, TensorFlow, PyTorch, NLP (NLTK, Sentence Transformers)
Data Processing & Visualization:
Pandas, NumPy, Matplotlib, Seaborn, Plotly
ML Engineering & MLOps:
Config-driven ML pipelines (YAML), Feature preprocessing, Model evaluation, Hyperparameter tuning, Experiment tracking, Model serialization (Pickle, Joblib)
Frameworks & Tools:
Flask, Streamlit, Git, GitHub, Docker (basic), VS Code, Jupyter

Projects

RiskFlow v2.0 — Customer Churn Intelligence Platform

  • Designed an end-to-end, production-style churn prediction system using structured banking data
  • Built a config-driven ML pipeline with preprocessing, model inference, and evaluation isolated from UI logic
  • Implemented probability-based risk scoring with interpretable outputs and controlled inference flow
  • Optimized inference using Gradient Boosting to meet strict memory and latency constraints
  • Deployed an enterprise-style analytical console simulating real-world decision-support systems
  • Code: GitHub (on request / limited access) Docs
  • Live App: https://riskflow.onrender.com
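
The probability-based risk scoring mentioned above can be illustrated with a minimal sketch. The tier names, the thresholds (0.3 and 0.6), and the function name are assumptions made for illustration only; they are not taken from the RiskFlow codebase.

```python
from dataclasses import dataclass

# Hypothetical cut-offs: the platform's real thresholds are not stated here.
# Each entry is (lower bound of churn probability, tier label), highest first.
RISK_TIERS = [(0.6, "High"), (0.3, "Medium"), (0.0, "Low")]


@dataclass
class RiskScore:
    probability: float  # model-predicted churn probability in [0, 1]
    tier: str           # interpretable label derived from the probability


def score_customer(churn_probability: float) -> RiskScore:
    """Map a churn probability to an interpretable risk tier."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be in [0, 1]")
    for lower_bound, tier in RISK_TIERS:
        if churn_probability >= lower_bound:
            return RiskScore(churn_probability, tier)
    raise AssertionError("unreachable: the last tier starts at 0.0")


if __name__ == "__main__":
    for p in (0.12, 0.45, 0.83):
        print(score_customer(p))
```

In a setup like this the probability would come from a classifier's predicted class probability, and keeping the thresholds in one place makes them easy to move into a config file.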

VectorCine AI — Vector Similarity–Based Recommendation Engine

  • Developed a content-based movie recommendation engine using vector embeddings and cosine similarity
  • Precomputed and served high-dimensional similarity vectors for low-latency interactive inference
  • Designed a system-oriented interface emphasizing observability, trust, and controlled user interaction
  • Addressed real deployment constraints by transparently handling large ML artifacts and memory limits
  • Architected the system to support future embedding upgrades without interface changes
  • Implemented IDE-inspired ghost autocomplete to reduce query friction and input errors during similarity search
  • Code: GitHub (on request / limited access) Docs
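
The idea of precomputing vectors and ranking by cosine similarity can be sketched as follows. This is a toy illustration under stated assumptions: the movie titles, the three-dimensional embeddings, and the function names are hypothetical, and the real engine's embedding model and storage format are not described here.

```python
import math


def build_index(embeddings):
    """Precompute unit-length vectors so each query only needs dot products."""
    index = {}
    for title, vec in embeddings.items():
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError(f"zero vector for {title!r}")
        index[title] = [x / norm for x in vec]
    return index


def top_k_similar(index, query_title, k=2):
    """Return the k titles most similar to query_title by cosine similarity."""
    query = index[query_title]
    scores = [
        (title, sum(a * b for a, b in zip(query, vec)))
        for title, vec in index.items()
        if title != query_title  # do not recommend the query itself
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


if __name__ == "__main__":
    # Hypothetical toy embeddings; real ones would be high-dimensional.
    toy = {
        "Movie A": [1.0, 0.0, 0.0],
        "Movie B": [0.9, 0.1, 0.0],
        "Movie C": [0.2, 0.8, 0.0],
        "Movie D": [0.0, 0.0, 1.0],
    }
    print(top_k_similar(build_index(toy), "Movie A", k=2))
```

Because the vectors are normalized once at build time, cosine similarity reduces to a dot product at query time, which is one common way to keep interactive lookups cheap.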

Certificates

  • Data Science, ML, DL & NLP Bootcamp – Krish Naik (Udemy): Learned data preprocessing, model building, evaluation, and deployment using Python, Scikit-learn, TensorFlow, and NLP techniques. View Certificate
  • SQL (Intermediate) – HackerRank: Gained proficiency in complex queries, joins, subqueries, and data filtering. View Certificate
  • Python Machine Learning: From Beginner to Pro (Udemy): Built ML models using supervised & unsupervised learning with Python. View Certificate

Education

ACEM Savitribai Phule University
Aug 2020 – Sep 2024
B.Tech (Computer Science Engineering)
Pune, Maharashtra
About Me

Bridging Data & Design.

01 // The Foundation

From software engineering to applied data science.

I started my journey building full-stack MERN applications. Today, I use that engineering background to ensure the machine learning models I train are actually built to be deployed, scaled, and integrated smoothly into production.

ML Engineering, Python Ecosystem

02 // Current Focus

Accuracy means nothing without deployment.

I care just as much about clean data pipelines as I do about algorithmic precision. My current work is heavily focused on predictive modeling, NLP, and designing functional RAG architectures.

Predictive Modeling, NLP & RAG Systems

The Arsenal

Technical Capabilities.

Production-ready data science and engineering. A curated stack of frameworks, languages, and methodologies built for scale and high-fidelity insights.

🧰

Core Languages & DBs

Production-ready code.

Python, SQL, PostgreSQL, SQLite
📊

Analytics & Viz

Stats & storytelling.

NumPy, Pandas, Matplotlib, Seaborn, Plotly, Power BI, EDA
🤖

Machine Learning Algorithms

Classical algorithms for tabular problems & data architecture.

Regression, Decision Tree, Random Forest, SVM, K-NN, Naive Bayes, Gradient Boosting, Feature Engineering
🛠️

ML Libraries

Fast, reliable models.

Scikit-learn, TensorFlow, XGBoost, LightGBM, NLTK
🧠

Unsupervised & DL

Embeddings & NLP.

ANN, CNN, NLP, Word Embeddings, TF-IDF
🚀

Deployment & Evaluation

Metrics, serialization, and production patterns.

Model Deployment, Streamlit, Flask, Docker, Git
F1-Score, Precision, Recall, ROC-AUC

Portfolio

Selected Works.

Predictive models, NLP pipelines, and scalable data architectures built for performance and real-world impact.

Explore Projects
Click on cards for technical deep-dives

Deployment Information

Hosted on Render (cold-start enabled).

First request may take ~60-80 seconds if the service is idle. Please be patient while the infrastructure spins up.

    Author:

    Applied Data Scientist — Intelligent Interfaces & ML Systems Engineering

    Deep Dives

    Data Explorations.

    In-depth EDA and business insights. Uncovering the hidden narratives and structural truths within raw datasets.

    The Proof

    Verified Credentials.

    Continuous learning is the foundation of data science. These are the documented milestones of my technical evolution.

    🧠
    July 15, 2025

    Data Science, ML & NLP Bootcamp

    Krish Naik

    Learned data preprocessing, model building, evaluation, and deployment using Python, Scikit-learn, TensorFlow, and advanced NLP techniques.

    View Certificate
    🗄️
    Jan 31, 2025

    SQL Intermediate Skills Certification

    HackerRank

    Gained strong proficiency in writing complex queries, performing joins, subqueries, and advanced data filtering techniques.

    View Certificate
    🤖
    Feb 09, 2025

    Python Machine Learning: Beginner to Pro

    Udemy

    Built robust ML models using supervised and unsupervised learning techniques with Python and applied them in practical projects.

    View Certificate
    Engineering Retrospective

    System Architecture.

    // SYSTEM_STATUS: ONLINE
    Built like a product. Engineered as a system.
    ~/docs/incident-report-11k.md

    The 11,000-Line Crucible

    Most portfolios are static templates. This architecture was custom-engineered from scratch to test the limits of browser performance. I explicitly chose pure Vanilla JS over React to gain absolute control over the render cycle. Managing an 11,000+ line codebase without a framework was brutal—it led to severe layout shifts—but it forced me to isolate the game loop and achieve 60fps stability.

    It wasn't overengineering; it was craftsmanship.

    Architectural Trade-Off
    Sacrificed: Dev Speed (No React V-DOM)
    Gained: Raw Performance & Render Control

    The Hybrid Edge

    My time as a MERN Stack Developer ingrained the discipline to build robust interfaces. Now, as a Data Scientist, I don't just display data—I build the logic that parses it.

    Frontend Logic + Machine Learning
    🧠
    SYS.MODULE.01

    Custom RAG Pipeline

    Inspired by modern browser capabilities, I built my own context-aware recruiter assistant. Powered by Vector Embeddings and a custom Retrieval-Augmented Generation (RAG) pipeline, the system matches projects to queries and ensures zero-hallucination responses.

    Vector Embeddings, Zero-Hallucination
    ⚡
    SYS.MODULE.02

    State-Driven UI

    Engineered a centralized state management store to handle a "Dynamic Island" style UI. This context-aware switching reduces viewport clutter by 40%. Paired with GSAP and Lenis, motion serves structural storytelling, not just aesthetics.

    GSAP + Lenis, Context-Aware Routing

    The Builder's Philosophy

    "When something inspires me, I don’t wait for access. I build my own version, test the constraints, and make it better. Function implies form."
    The Changelog

    Journey & Evolution.

    You might wonder why this portfolio feels so modern and polished — it reflects my experience as a former full-stack developer and my transition into Data Science.

    It combines strict engineering discipline with machine learning expertise.

    Education B.Tech — CSE • Graduated 2024
    01
    Pre-2024

    JARVIS OS Project

    Independent Development

    • Built a desktop virtual assistant (JARVIS OS)
    • Implemented voice-based control & automation
    • Designed a modular and scalable architecture
    • Added voice authentication for security
    • Developed independently over 720+ hours
    02
    Early 2024

    MERN Stack Developer

    Full-Stack Architecture

    • Developed full-stack web applications using MERN
    • Built an Uber clone with AI assistance
    • Created an employee management system
    • Implemented APIs & authentication
    • Designed responsive UI using React
    MongoDB, Express, React, Node.js
    03
    Mid 2024

    Remote Internships

    MERN Stack Developer

    • Worked with distributed remote teams
    • Built authentication systems
    • Developed pizza ordering websites
    • Debugged APIs and backend services
    • Used Git & GitHub for version control
    04
    Late 2024

    Transition to Data Science

    ML & NLP Focus

    • Shifted from full-stack to data science
    • Mastered Python Data Stack (Pandas, NumPy, Scikit-Learn)
    • Specialized in NLP architectures & Predictive Modeling
    • Engineered robust features for high-accuracy models
    • Applied software discipline to experimental ML workflows
    The Pivot
    05
    Jan 2025 – Present

    Data Science Practitioner

    Deep Learning & Architecture

    • Actively seeking opportunities in Data Science & AI
    • Architecting end-to-end Deep Learning solutions
    • Exploring Large Language Models (LLMs) & Transformers
    • Refining skills in advanced algorithms
    • Translating complex data into actionable business insights
    Status: Ready
    The Details

    Common Queries.

    Background, availability, and my approach to applied machine learning and system engineering.

    Nayan’s background in Full Stack development gave him a solid engineering foundation and an appreciation for how complete applications are built. Over time, he became more interested in the parts of systems where data informs decisions. Working with mathematics, statistics, and machine learning on real-world problems felt more engaging to him, and today he focuses on combining his engineering experience with applied machine learning to build practical, usable solutions.

    Nayan is available to join immediately. His academic commitments are complete, and he is fully focused on building real-world ML systems and pursuing full-time roles. He is open to remote opportunities and, depending on the role and team, to relocation within Maharashtra, India, limited to Pune and Nagpur.

    Nayan’s primary focus is Applied Machine Learning. He enjoys taking models beyond notebooks and integrating them into usable products. While he keeps up with current research to stay technically sharp, his strength lies in applying those ideas to practical, high-fidelity systems.

    This portfolio was designed and built entirely from scratch, not from a template. The goal was to reflect how applied ML tools are built, presented, and trusted in real-world environments — not just to look polished. This engineering-first approach is influenced by Nayan's background as a former MERN stack developer.

    LET'S WORK TOGETHER

    Nayan Darokar

    SYS_ADMIN
    /// V.2.0.4

    The Architect

    I am the deterministic logic behind this ecosystem. As an Applied Data Scientist, I engineer neural architectures and scalable pipelines—transforming raw chaos into high-fidelity intelligence.

    > Accuracy is nothing without deployment.

    Current Location

    Pune, Maharashtra, India

    Status

    Deployed

    Projects Completed

    12+ Live Models

    Portfolio

    Version 2
