Project Overview
This project explores room booking data from an imaginary hotel chain. Using multiple datasets, I analyzed patterns in bookings, customer preferences, and operational bottlenecks like overbookings and capacity limits.
The goal was to clean, analyze, and visualize real-world booking data to derive insights that a hotel manager or booking platform could use to optimize operations and increase booking which are dropping currently.
Dataset Breakdown
(Note: The dataset was aquired as process of learning from codebasics.io)
The analysis was done on five core CSV files:
-
Dim_Date.csv: Calendar info (3 months) -
Dim_Hotels.csv: All. Hotel/property info. -
Dim_rooms.csv: Room type categories. [Standard, Elite, Premium, Presidential] -
fact_booking.csv: Guest booking details fact_aggregated_bookings.csv: Aggreagated data on bookings
Technical Analysis
The Full Story
Working on this project was like stepping into the shoes of a consultant for a struggling hospitality group. Through Codebasics.io’s bootcamp, I was introduced to AtliQ Grands, a fictional luxury hotel chain with operations across four Indian cities. They had one goal: regain their market share and boost revenue using data-backed strategies. As an aspiring data analyst, I took on the challenge.
Understanding the Challenge
AtliQ Grands, with a presence in four major Indian cities, was experiencing a decline in market share and revenue within the luxury/business hotel segment. The management recognized the need for data-driven insights to inform strategic decisions but lacked an in-house analytics team. This is where I stepped in, tasked with analyzing their historical data to uncover actionable insights.
I utilized Python’s pandas library to clean and merge these datasets, ensuring consistency and readiness for analysis.
Data Cleaning and Preparation
The datasets were generally clean, but a few necessary steps were performed:
-
Removed unnecessary white spaces and ensured uniform formatting of string fields.
-
Converted date fields (e.g.,
booking_date,check_in_date,checkout_date) to datetime objects. -
Checked for nulls and duplicates, especially in
fact_bookings. -
Normalized values across categorical columns to maintain consistency (e.g., room class capitalization).
Data Transformation
To extract actionable insights, I created new columns and merged datasets:
-
Joined room and hotel metadata into the bookings dataset to enrich each row with city and category information.
-
Engineered new metrics:
-
Occupancy Rate = booked rooms / room capacity
-
Revenue Realized vs. Revenue Generated to analyze booking fulfillment.
-
-
Converted daily-level bookings to monthly aggregates for trend analysis.
-
Flagged weekdays vs. weekends using
dim_date.
Deep-Dive Analyses
Room Type Analysis
Rooms were categorized into:
-
Elite
-
Executive
-
Premium
-
Suite
Elite rooms emerged as the top choice, indicating a strong preference for upper-tier comfort. They also brought in the highest revenue per booking. Conversely, Executive rooms had a higher booking count but lower per-unit revenue.
Booking Platform Insights
The key platforms included:
-
Makeyourtrip
-
Bookusnow
-
Stayzilla
-
Direct Booking
Makeyourtrip was the most dominant, accounting for a significant portion of revenue. However, direct bookings showed the highest revenue realized per booking, suggesting better profit margins. This insight hinted at promoting direct bookings through loyalty programs.
Ratings Distribution
Ratings ranged from 1 to 5. Most guests rated their stay either 4 or 5, suggesting overall satisfaction. However, properties with consistent low ratings (particularly in tier-2 cities) need immediate attention.
Bar charts and descriptive statistics revealed:
-
A positive correlation between higher ratings and room revenue.
-
Low-rated bookings had a higher cancellation rate.
Booking Types and Statuses
Booking statuses included:
-
Confirmed
-
Cancelled
-
No-show
Cancellations formed around 18% of total bookings. Many cancellations occurred for Premium rooms, indicating a possible pricing or expectation mismatch. This suggests reconsidering cancellation policies or overbooking strategies.
Analytical Approach
I followed a structured approach:
Data Enrichment: Combined dimensions to add context.
Exploratory Data Analysis: Used bar plots, .describe(), and groupby aggregations.
KPI Calculations:
-
-
Occupancy Rate
-
Average Daily Rate (ADR)
-
Revenue per Available Room (RevPAR)
-
Cancellation Rate
-
Comparative Analysis: Revenue and performance sliced by city, hotel, room class, and platform.
Root Cause Identification: Highlighted underperforming properties and overbooked segments.
Key Findings
-
Mumbai generated the highest total revenue (40%+), making it the most critical market.
-
AtliQ Exotica, a luxury property, stood out as a top performer.
-
Elite rooms were the most booked and yielded the highest revenue.
-
Makeyourtrip dominated in volume, but Direct Bookings were more profitable.
-
Weekend occupancy was consistently higher — perfect for implementing dynamic pricing.
-
Ratings were generally positive, but specific hotels required quality improvements.
-
Cancellations and no-shows impacted revenue by nearly 20%.
Recommendations
Dynamic Pricing: Implement higher pricing for weekends and high-occupancy periods.
Promote Direct Bookings: Offer incentives for loyalty or app bookings to reduce commission losses.
Focus on Elite Rooms: Tailor marketing strategies around the most preferred room class.
Targeted Improvement Campaigns: Use rating data to improve underperforming properties.
Revise Cancellation Policies: Introduce stricter or more dynamic cancellation windows based on room type.
Final Thoughts
This project taught me how data can truly transform business strategy. From cleaning messy files to extracting KPIs and recommending strategic changes, the AtliQ Grands case sharpened my end-to-end analytics workflow. Hospitality is a nuanced industry — blending customer experience with profitability — and I’m glad I could add value with data.
