👋 Hi, I'm Nayan. I'm a Data Scientist & Applied ML Engineer.

Transforming raw data into actionable inference by designing scalable ML pipelines and deep learning models for real-world system optimization

Resume Preview

Download PDF

Nayan Darokar

+91-7415440094 | reachout.nayan@gmail.com | https://github.com/Nayann23 | LinkedIn

Technical Skills

Languages & Database:

Python, SQL, PostgreSQL

Machine Learning & AI:

Scikit-learn, XGBoost, LightGBM, TensorFlow, PyTorch, NLP (NLTK, Sentence Transformers)

Data Processing & Visualization:

Pandas, NumPy, Matplotlib, Seaborn, Plotly

ML Engineering & MLOps:

Config-driven ML pipelines (YAML), Feature preprocessing, Model evaluation, Hyperparameter tuning, Experiment tracking, Model serialization (Pickle, Joblib)

Frameworks & Tools:

Flask, Streamlit, Git, GitHub, Docker (basic), VS Code, Jupyter

Projects

RiskFlow v2.0 — Customer Churn Intelligence Platform

Designed an end-to-end, production-style churn prediction system using structured banking data
Built a config-driven ML pipeline with preprocessing, model inference, and evaluation isolated from UI logic
Implemented probability-based risk scoring with interpretable outputs and a controlled inference flow
Optimized inference using Gradient Boosting to meet strict memory and latency constraints
Deployed an enterprise-style analytical console simulating real-world decision-support systems
Code: GitHub (on request / limited access) Docs
Live App: https://riskflow.onrender.com
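
The probability-based risk scoring and config-driven design described above can be illustrated with a minimal sketch. This is not the RiskFlow code: the band labels, the thresholds, and the names `CONFIG_TEXT`, `load_config`, and `score_risk` are hypothetical, and JSON stands in for the YAML config so the example needs only the standard library.

```python
import json

# Hypothetical configuration; in a YAML-driven pipeline this would live in a
# config file. JSON is used here only to keep the sketch standard-library only.
CONFIG_TEXT = """
{
  "risk_bands": [
    {"label": "low", "max_probability": 0.30},
    {"label": "medium", "max_probability": 0.60},
    {"label": "high", "max_probability": 1.00}
  ]
}
"""


def load_config(text):
    """Parse the configuration and sort the bands by their upper bound."""
    config = json.loads(text)
    config["risk_bands"].sort(key=lambda band: band["max_probability"])
    return config


def score_risk(probability, config):
    """Map a churn probability in [0, 1] to the first band that contains it."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for band in config["risk_bands"]:
        if probability <= band["max_probability"]:
            return {"probability": probability, "risk": band["label"]}
    raise ValueError("risk bands do not cover probability 1.0")


config = load_config(CONFIG_TEXT)
# Stand-in probabilities; a real pipeline would take these from the model.
for p in (0.12, 0.45, 0.91):
    print(score_risk(p, config))
```

Keeping the thresholds in configuration rather than in code means the risk bands can be tuned without touching the inference or UI logic.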

VectorCine AI — Vector Similarity–Based Recommendation Engine

Developed a content-based movie recommendation engine using vector embeddings and cosine similarity
Precomputed and served high-dimensional similarity vectors for low-latency interactive inference
Designed a system-oriented interface emphasizing observability, trust, and controlled user interaction
Addressed real deployment constraints by transparently handling large ML artifacts and memory limits
Architected the system to support future embedding upgrades without interface changes
Implemented IDE-inspired ghost autocomplete to reduce query friction and input errors during similarity search
Code: GitHub (on request / limited access) Docs
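
As a rough illustration of the precompute-then-serve idea above, the sketch below ranks neighbors by cosine similarity once and answers requests with a lookup. It is not the VectorCine implementation: the movie titles, the three-dimensional vectors, and the function names are made up for the example.

```python
import math

# Toy embeddings (hypothetical values); a real system would load vectors
# produced by an embedding model.
EMBEDDINGS = {
    "Movie A": [0.9, 0.1, 0.0],
    "Movie B": [0.8, 0.2, 0.1],
    "Movie C": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def precompute_neighbors(embeddings):
    """Build, once and offline, a ranked neighbor list for every title."""
    neighbors = {}
    for title, vector in embeddings.items():
        scored = [
            (other, cosine_similarity(vector, other_vector))
            for other, other_vector in embeddings.items()
            if other != title
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        neighbors[title] = scored
    return neighbors


NEIGHBORS = precompute_neighbors(EMBEDDINGS)


def recommend(title, k=2):
    """Serve recommendations with a dictionary lookup, no similarity math."""
    return [other for other, _ in NEIGHBORS[title][:k]]


print(recommend("Movie A", k=1))
```

Because the request path only reads `NEIGHBORS`, swapping in a different embedding model would change the offline step but not the `recommend` interface.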

Certificates

Data Science, ML, DL & NLP Bootcamp – Krish Naik (Udemy): Learned data preprocessing, model building, evaluation, and deployment using Python, Scikit-learn, TensorFlow, and NLP techniques. View Certificate
SQL (Intermediate) – HackerRank: Gained proficiency in complex queries, joins, subqueries, and data filtering. View Certificate
Python Machine Learning: From Beginner to Pro (Udemy): Built ML models using supervised & unsupervised learning with Python. View Certificate

Education

ACEM Savitribai Phule University

Aug 2020 – Sep 2024

B.Tech (Computer Science Engineering)

Pune, Maharashtra

SYS_PROFILE // DATA_PIPELINE // EPOCH_01

About Me

Bridging Data & Design.

"Building systems that don't just process information, but learn from it."

01 // The Foundation

From software engineering to applied data science.

I started my journey building full-stack MERN applications. Today, I use that engineering background to ensure the machine learning models I train are actually built to be deployed, scaled, and integrated smoothly into production.

ML Engineering, Python Ecosystem

02 // Current Focus

Accuracy means nothing without deployment.

I care just as much about clean data pipelines as I do about algorithmic precision. My current work is heavily focused on predictive modeling, NLP, and designing functional RAG architectures.

Predictive Modeling, NLP & RAG Systems

The Arsenal

Technical Capabilities.

Production-ready data science and engineering. A curated stack of frameworks, languages, and methodologies built for scale and high-fidelity insights.

🧰

Core Languages & DBs

Production-ready code.

Python, SQL, PostgreSQL, SQLite

📊

Analytics & Viz

Stats & storytelling.

NumPy, Pandas, Matplotlib, Seaborn, Plotly, Power BI, EDA

🤖

Machine Learning Algorithms

Classical algorithms for tabular problems & data architecture.

Regression, Decision Tree, Random Forest, SVM, K-NN, Naive Bayes, Gradient Boosting, Feature Engineering

🛠️

ML Libraries

Fast, reliable models.

Scikit-learn, TensorFlow, XGBoost, LightGBM, NLTK

🧠

Unsupervised & DL

Embeddings & NLP.

ANN, CNN, NLP, Word Embeddings, TF-IDF

🚀

Deployment & Evaluation

Metrics, serialization, and production patterns.

Model Deployment, Streamlit, Flask, Docker, Git

F1-Score, Precision, Recall, ROC-AUC

Portfolio

Selected Works.

Predictive models, NLP pipelines, and scalable data architectures built for performance and real-world impact.

Initial load may take 60–80 seconds due to cold-start infrastructure. Hosted on Render free tier.

Deployment Information

Hosted on Render (cold-start enabled).

First request may take ~60-80 seconds if the service is idle. Please be patient while the infrastructure spins up.

Deep Dives

Data Explorations.

In-depth EDA and business insights. Uncovering the hidden narratives and structural truths within raw datasets.

The Proof

Verified Credentials.

Continuous learning is the foundation of data science. These are the documented milestones of my technical evolution.

🧠

July 15, 2025

Data Science, ML & NLP Bootcamp

Krish Naik

Learned data preprocessing, model building, evaluation, and deployment using Python, Scikit-learn, TensorFlow, and advanced NLP techniques.

View Certificate

🗄️

Jan 31, 2025

SQL Intermediate Skills Certification

HackerRank

Gained strong proficiency in writing complex queries, performing joins, subqueries, and advanced data filtering techniques.

View Certificate

🤖

Feb 09, 2025

Python Machine Learning: Beginner to Pro

Udemy

Built robust ML models using supervised and unsupervised learning techniques with Python and applied them in practical projects.

View Certificate

Engineering Retrospective

System Architecture.

// SYSTEM_STATUS: ONLINE

Built like a product. Engineered as a system.

~/docs/incident-report-11k.md

The 11,000-Line Crucible

Most portfolios are static templates. This one was custom-engineered from scratch to test the limits of browser performance. I deliberately chose vanilla JS over React to gain direct control over the render cycle. Managing an 11,000+ line codebase without a framework was brutal and caused severe layout shifts, but it forced me to isolate the game loop and achieve 60fps stability.

It wasn't overengineering; it was craftsmanship.

Architectural Trade-Off

Sacrificed: Dev Speed (No React V-DOM) → Gained: Raw Performance & Render Control

The Hybrid Edge

My time as a MERN Stack Developer ingrained the discipline to build robust interfaces. Now, as a Data Scientist, I don't just display data—I build the logic that parses it.

Frontend Logic + Machine Learning

🧠

SYS.MODULE.01

Custom RAG Pipeline

Inspired by modern browser capabilities, I built my own context-aware recruiter assistant. Powered by Vector Embeddings and a custom Retrieval-Augmented Generation (RAG) pipeline, the system matches projects to queries and ensures zero-hallucination responses.

Vector Embeddings, Zero-Hallucination
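
One common way to keep a retrieval-based assistant from inventing answers is to decline whenever no stored document is similar enough to the query. The sketch below shows that guard under stated assumptions: bag-of-words counts stand in for learned vector embeddings, and the document texts, the `min_score` threshold, and the function names are hypothetical rather than taken from the actual pipeline.

```python
import math
from collections import Counter

# Hypothetical knowledge base entries, paraphrased from the project list.
DOCUMENTS = {
    "RiskFlow": "customer churn prediction system using structured banking data",
    "VectorCine": "movie recommendation engine using vector embeddings and cosine similarity",
}


def vectorize(text):
    """Bag-of-words counts, standing in for learned embeddings."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, min_score=0.2):
    """Return the best-matching project, or None when nothing is close enough."""
    query_vector = vectorize(query)
    best_name, best_score = None, 0.0
    for name, text in DOCUMENTS.items():
        score = cosine(query_vector, vectorize(text))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None


def answer(query):
    """Answer only from retrieved text; otherwise say so explicitly."""
    name = retrieve(query)
    if name is None:
        # Declining is the guard against unsupported answers.
        return "I don't have information about that."
    return f"{name}: {DOCUMENTS[name]}"


print(answer("movie recommendation"))
print(answer("favorite pizza topping"))
```

The threshold trades coverage for safety: a higher `min_score` refuses more queries but returns fewer weak matches.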

⚡

SYS.MODULE.02

State-Driven UI

Engineered a centralized state management store to handle a "Dynamic Island" style UI. This context-aware switching reduces viewport clutter by 40%. Paired with GSAP and Lenis, motion serves structural storytelling, not just aesthetics.

GSAP + Lenis, Context-Aware Routing

The Builder's Philosophy

"When something inspires me, I don’t wait for access. I build my own version, test the constraints, and make it better. Function implies form."

The Changelog

Journey & Evolution.

You might wonder why this portfolio feels so modern and polished — it reflects my experience as a former full-stack developer and my transition into Data Science.

It combines strict engineering discipline with machine learning expertise.

Education B.Tech — CSE • Graduated 2024

Pre-2024

JARVIS OS Project

Independent Development

▹ Built a desktop virtual assistant (JARVIS OS)
▹ Implemented voice-based control & automation
▹ Designed a modular and scalable architecture
▹ Added voice authentication for security
▹ Independently developed over 720 hours

Early 2024

MERN Stack Developer

Full-Stack Architecture

▹ Developed full-stack web applications using MERN
▹ Built an Uber clone with AI assistance
▹ Created an employee management system
▹ Implemented APIs & authentication
▹ Designed responsive UI using React

MongoDB, Express, React, Node.js

Mid 2024

Remote Internships

MERN Stack Developer

▹ Worked with distributed remote teams
▹ Built authentication systems
▹ Developed pizza ordering websites
▹ Debugged APIs and backend services
▹ Used Git & GitHub for version control

Late 2024

Transition to Data Science

ML & NLP Focus

▹ Shifted from full-stack to data science
▹ Mastered Python Data Stack (Pandas, NumPy, Scikit-Learn)
▹ Specialized in NLP architectures & Predictive Modeling
▹ Engineered robust features for high-accuracy models
▹ Applied software discipline to experimental ML workflows

The Pivot

Jan 2025 – Present

Data Science Practitioner

Deep Learning & Architecture

▹ Actively seeking opportunities in Data Science & AI
▹ Architecting end-to-end Deep Learning solutions
▹ Exploring Large Language Models (LLMs) & Transformers
▹ Refining skills in advanced algorithms
▹ Translating complex data into actionable business insights

Status: Ready

The Details

Common Queries.

Background, availability, and my approach to applied machine learning and system engineering.

Why Data Science after Full Stack?

Nayan’s background in full-stack development gave him a solid engineering foundation and an appreciation for how complete applications are built. Over time, he became more interested in the parts of systems where data informs decisions. Working with mathematics, statistics, and machine learning on real-world problems felt more engaging to him, and today he focuses on combining his engineering experience with applied machine learning to build practical, usable solutions.

What is your availability?

Nayan is available to join immediately. His academic commitments are complete, and he is fully focused on building real-world ML systems and pursuing full-time roles. He is open to remote roles and to relocation, limited to Pune or Nagpur in Maharashtra, India, depending on the role and team.

Research or Applied ML?

Nayan’s primary focus is Applied Machine Learning. He enjoys taking models beyond notebooks and integrating them into usable products. While he keeps up with current research to stay technically sharp, his strength lies in applying those ideas to practical, high-fidelity systems.

How was this portfolio designed and built - is it based on a template?

This portfolio was designed and built entirely from scratch, not from a template. The goal was to reflect how applied ML tools are built, presented, and trusted in real-world environments, not just to look polished. This engineering-first approach is influenced by Nayan's background as a former MERN stack developer.

LET'S WORK TOGETHER

01 // Direct

Avg Response: < 12h

Start a Conversation

To: Nayan Darokar

Subject: Exploring Collaboration Opportunities

Hi Nayan,

I reviewed your data science portfolio and was impressed by your architectural approach. Let's schedule a time to chat about...

Network

Code

GitHub

Details

Resume .PDF
