---

title: Hotel Booking Cancellation Risk
emoji: 🏨
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---


# Hotel Booking Cancellation Risk Predictor

## Model description

A **HistGradientBoosting** classifier trained on the
[Hotel Booking Demand](https://www.kaggle.com/datasets/jessemostipak/hotel-booking-demand)
dataset (Antonio et al., 2019). Given eight booking parameters, the model returns the
probability that the booking will be cancelled.

**Test AUC:** 0.9067

## Intended use

Educational demo for the course *Introduction to Digital Content and Artificial Intelligence*,
Universitat de València (2025/2026). **Not intended for production use.**

## Inputs

| Field | Type | Description |
|---|---|---|
| Hotel type | Categorical | City Hotel or Resort Hotel |
| Lead time | Integer (days) | Days between booking and arrival |
| Deposit type | Categorical | No Deposit / Non Refund / Refundable |
| Market segment | Categorical | Channel through which booking was made |
| Special requests | Integer (0–5) | Number of special requests |
| Previous cancellations | Integer | Past cancellations by this guest |
| ADR | Float (€) | Average Daily Rate |
| Total nights | Integer | Length of stay |
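
A minimal sketch of how the eight form fields could be represented and checked before scoring. The class name, field names, and validation rules are hypothetical and are not taken from `app.py`; only the allowed hotel and deposit values and the 0–5 range come from the table above.

```python
from dataclasses import dataclass

# Allowed values copied from the inputs table; everything else is illustrative.
HOTEL_TYPES = {"City Hotel", "Resort Hotel"}
DEPOSIT_TYPES = {"No Deposit", "Non Refund", "Refundable"}


@dataclass(frozen=True)
class BookingInput:
    """Hypothetical container for the eight form fields."""

    hotel_type: str
    lead_time: int               # days between booking and arrival
    deposit_type: str
    market_segment: str          # not validated: the table lists no fixed values
    special_requests: int        # expected range 0-5
    previous_cancellations: int
    adr: float                   # average daily rate in euros
    total_nights: int

    def validate(self) -> list[str]:
        """Return human-readable problems; an empty list means the input passed."""
        problems = []
        if self.hotel_type not in HOTEL_TYPES:
            problems.append(f"unknown hotel type: {self.hotel_type!r}")
        if self.deposit_type not in DEPOSIT_TYPES:
            problems.append(f"unknown deposit type: {self.deposit_type!r}")
        if self.lead_time < 0:
            problems.append("lead time must be non-negative")
        if not 0 <= self.special_requests <= 5:
            problems.append("special requests must be between 0 and 5")
        if self.previous_cancellations < 0:
            problems.append("previous cancellations must be non-negative")
        return problems
```

Returning a list of problems rather than raising on the first one lets a form report every invalid field at once.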

## Output

Probability of cancellation, labelled as Low (< 30%), Medium (30–60%), or High (> 60%).
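
The labelling rule above can be sketched as a small function. The function name is hypothetical, and the handling of the exact boundaries is an assumption: the ranges as written place 30% and 60% in Medium, but `app.py` may treat them differently.

```python
def risk_label(probability: float) -> str:
    """Map a cancellation probability in [0, 1] to Low, Medium, or High."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.30:
        return "Low"
    if probability <= 0.60:   # assumption: both 30% and 60% count as Medium
        return "Medium"
    return "High"
```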

## Known limitations

- Trained on bookings from two hotels in Portugal (2015–2017). May not generalise to other markets.
- Features that the model expects but the form does not collect are imputed with training-set medians.
- No concept-drift detection, so model performance may degrade over time.
- Labels are in English; the target audience may need localisation.
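
The median imputation mentioned in the limitations above could look roughly like the following sketch. It uses only the standard library; the function names, the feature names `adults` and `booking_changes`, and the toy training rows are invented for illustration and do not reflect the actual training pipeline.

```python
from statistics import median


def fit_medians(training_rows):
    """Compute the median of each numeric column over the training rows."""
    columns = training_rows[0].keys()
    return {col: median(row[col] for row in training_rows) for col in columns}


def impute(form_values, medians):
    """Keep the values the form supplied; fill the rest with training medians."""
    return {**medians, **form_values}


# Toy training data (made up), with two features the form does not ask for.
training_rows = [
    {"lead_time": 10, "adults": 2, "booking_changes": 0},
    {"lead_time": 50, "adults": 1, "booking_changes": 1},
    {"lead_time": 90, "adults": 2, "booking_changes": 0},
]
medians = fit_medians(training_rows)

# The form supplies only lead_time; the other two fall back to their medians.
full_row = impute({"lead_time": 14}, medians)
```

Because the medians are computed once from training data, every prediction sees the same fill-in values regardless of what other users enter.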

## Privacy & GDPR

This demo accepts no personally identifiable information (PII), and no data entered in the
form is stored or logged. A production deployment handling guest data would require a full
data protection impact assessment (DPIA) under the GDPR and should run on EU-based cloud
infrastructure (for example, Azure West Europe or an AWS eu-west region).