---
title: Hotel Booking Cancellation Risk
emoji: 🏨
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: "5.0.0"
python_version: "3.11"
app_file: app.py
pinned: false
---

# Hotel Booking Cancellation Risk Predictor

## Model description

A **HistGradientBoosting** classifier trained on the [Hotel Booking Demand](https://www.kaggle.com/datasets/jessemostipak/hotel-booking-demand) dataset (Antonio et al., 2019). Given eight booking parameters, the model returns the probability that the booking will be cancelled.

**Test AUC:** 0.9067

## Intended use

Educational demo for the course *Introduction to Digital Content and Artificial Intelligence*, Universitat de València (2025/2026). **Not intended for production use.**

## Inputs

| Field | Type | Description |
|---|---|---|
| Hotel type | Categorical | City Hotel or Resort Hotel |
| Lead time | Integer (days) | Days between booking and arrival |
| Deposit type | Categorical | No Deposit / Non Refund / Refundable |
| Market segment | Categorical | Channel through which booking was made |
| Special requests | Integer (0-5) | Number of special requests |
| Previous cancellations | Integer | Past cancellations by this guest |
| ADR | Float (€) | Average Daily Rate |
| Total nights | Integer | Length of stay |

## Output

Probability of cancellation, labelled as Low (< 30%), Medium (30–60%), or High (> 60%).

## Known limitations

- Trained on European hotel data (Portugal) from 2015–2017. May not generalise to other markets.
- Missing values for features not provided in the form are imputed with training-set medians.
- No concept drift detection; model performance may degrade over time.
- Labels are in English; the target audience may need localisation.

## Privacy & GDPR

This demo accepts no personally identifiable information (PII). No data entered in the form is stored or logged.
For a production deployment that handles guest data, a full GDPR data protection impact assessment (DPIA) would be required, and EU-resident cloud infrastructure should be used (for example, Azure West Europe or an AWS eu-west region).
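
## Example: risk bands and median imputation

The risk bands listed under Output and the median imputation mentioned under Known limitations can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function names `risk_label` and `fill_missing`, the feature keys `lead_time` and `adults`, and the median values are all hypothetical.

```python
def risk_label(probability: float) -> str:
    """Map a cancellation probability in [0, 1] to Low / Medium / High.

    Bands follow the Output section: Low (< 30%), Medium (30-60%), High (> 60%).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.30:
        return "Low"
    if probability <= 0.60:
        return "Medium"
    return "High"


def fill_missing(form_values: dict, training_medians: dict) -> dict:
    """Build a full feature row: start from the medians, then overlay form values.

    Any feature the form does not supply keeps its training-set median.
    """
    return {**training_medians, **form_values}


# Hypothetical medians and form input, for illustration only.
example_medians = {"lead_time": 69, "adults": 2}
example_row = fill_missing({"lead_time": 45}, example_medians)
```

Here `example_row` keeps the form's `lead_time` and falls back to the median for `adults`. In the real app, the probability passed to `risk_label` would presumably come from the classifier's `predict_proba` output for the positive (cancelled) class.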