10/3/2015

Data - Personalize Expedia Hotel Searches - ICDM 2013 | Kaggle
Completed $25,000 337 teams

Personalize Expedia Hotel Searches - ICDM 2013


Tue 3 Sep 2013 to Mon 4 Nov 2013 (23 months ago)

Data Files
File Name | Available Formats
data | .zip (414.44 mb)
basicPythonBenchmark | .zip (27.14 mb)
testOrderBenchmark | .zip (25.98 mb)
randomBenchmark | .zip (27.16 mb)

Leaderboard
1. commendo - part of Opera Solutions
2. Owen
3. Jun Wang@UniGe
4. idle_speculation
5. binghsu & MLRush & BrickMover
6. J.A. Guerrero
7. AuroraXie
8. n_m & Pacific Rim
9. Gxav
10. Leustagos & Dmitry Efimov
Sample code to create the benchmarks is available on GitHub.


train.csv - the training set
test.csv - the test set (this contains data for both the public leaderboard
and the final evaluation, which is randomly split between the two sets)
Note: test.csv does not contain the following columns: position, click_bool,
gross_bookings_usd, or booking_bool
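
The note about columns missing from test.csv can be checked directly by comparing the two file headers. The sketch below is only an illustration: it assumes both files are comma-separated with a header row, and it uses tiny made-up in-memory samples rather than the real data. The helper names `read_header` and `train_only_columns` are hypothetical.

```python
import csv
import io

# Columns the data page lists as absent from test.csv.
TRAIN_ONLY = {"position", "click_bool", "gross_bookings_usd", "booking_bool"}


def read_header(handle):
    """Return the column names from the first row of an open CSV file."""
    return next(csv.reader(handle))


def train_only_columns(train_header, test_header):
    """Columns present in the training header but missing from the test header."""
    test_set = set(test_header)
    return [name for name in train_header if name not in test_set]


# Made-up miniature files standing in for train.csv and test.csv
# (assumption: the real files are comma-separated with a header row).
sample_train = io.StringIO(
    "srch_id,prop_id,position,click_bool,gross_bookings_usd,booking_bool\n"
    "1,10,3,0,,0\n"
)
sample_test = io.StringIO("srch_id,prop_id\n2,11\n")

missing = train_only_columns(read_header(sample_train), read_header(sample_test))
print(missing)
```

With the real files, the same helpers could be called on handles opened with `open("train.csv", newline="")` and `open("test.csv", newline="")`.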
You can refer to www.expedia.com to better understand hotel search.
Hotel refers to hotels, apartments, B&Bs, hostels and other properties appearing on
Expedia's websites. Room types are not distinguished and the data can be assumed to
apply to the least expensive room type.
Most of the data are for searches that resulted in a purchase, but a small proportion
are for searches not leading to a purchase.
Usage of outside data is prohibited and modeling should focus fully on the given
data.
ColumnName | DataType | Description
srch_id | Integer | The ID of the search
date_time | Date/time | Date and time of the search
site_id | Integer | ID of the Expedia point of sale (i.e. Expedia.com, Expedia.co.uk, Expedia.co.jp, ..)
visitor_location_country_id | Integer | The ID of the country the customer is located in
visitor_hist_starrating | Float | The mean star rating of hotels the customer has previously purchased; null signifies there is no purchase history on the customer
visitor_hist_adr_usd | Float | The mean price per night (in US$) of the hotels the customer has previously purchased; null signifies there is no purchase history on the customer
prop_country_id | Integer | The ID of the country the hotel is located in
prop_id | Integer | The ID of the hotel
prop_starrating | Integer | The star rating of the hotel, from 1 to 5, in increments of 1. A 0 indicates the property has no stars, the star rating is not known or cannot be publicized.
prop_review_score | Float | The mean customer review score for the hotel on a scale out of 5, rounded to 0.5 increments. A 0 means there have been no reviews, null that the information is not available.
prop_brand_bool | Integer | +1 if the hotel is part of a major hotel chain; 0 if it is an independent hotel
prop_location_score1 | Float | A (first) score outlining the desirability of a hotel's location
prop_location_score2 | Float | A (second) score outlining the desirability of the hotel's location
prop_log_historical_price | Float | The logarithm of the mean price of the hotel over the last trading period. A 0 will occur if the hotel was not sold in that period.
position | Integer | Hotel position on Expedia's search results page. This is only provided for the training data, but not the test data.
price_usd | Float | Displayed price of the hotel for the given search. Note that different countries have different conventions regarding displaying taxes and fees and the value may be per night or for the whole stay.
promotion_flag | Integer | +1 if the hotel had a sale price promotion specifically displayed
gross_booking_usd | Float | Total value of the transaction. This can differ from the price_usd due to taxes, fees, conventions on multiple day bookings and purchase of a room type other than the one shown in the search.
srch_destination_id | Integer | ID of the destination where the hotel search was performed
srch_length_of_stay | Integer | Number of nights stay that was searched
srch_booking_window | Integer | Number of days in the future the hotel stay started from the search date
srch_adults_count | Integer | The number of adults specified in the hotel room
srch_children_count | Integer | The number of (extra occupancy) children specified in the hotel room
srch_room_count | Integer | Number of hotel rooms specified in the search
srch_saturday_night_bool | Boolean | +1 if the stay includes a Saturday night, starts from Thursday and has a length of stay less than or equal to 4 nights (i.e. weekend); otherwise 0
srch_query_affinity_score | Float | The log of the probability a hotel will be clicked on in Internet searches (hence the values are negative). A null signifies there are no data (i.e. hotel did not register in any searches).
orig_destination_distance | Float | Physical distance between the hotel and the customer at the time of search. A null means the distance could not be calculated.
random_bool | Boolean | +1 when the displayed sort was random, 0 when the normal sort order was displayed
comp1_rate | Integer | +1 if Expedia has a lower price than competitor 1 for the hotel; 0 if the same; -1 if Expedia's price is higher than competitor 1; null signifies there is no competitive data
comp1_inv | Integer | +1 if competitor 1 does not have availability in the hotel; 0 if both Expedia and competitor 1 have availability; null signifies there is no competitive data
comp1_rate_percent_diff | Float | The absolute percentage difference (if one exists) between Expedia and competitor 1's price (Expedia's price the denominator); null signifies there is no competitive data
comp2_rate, comp2_inv, comp2_rate_percent_diff | | (same, for competitor 2 through 8)
comp3_rate, comp3_inv, comp3_rate_percent_diff | |
comp4_rate, comp4_inv, comp4_rate_percent_diff | |
comp5_rate, comp5_inv, comp5_rate_percent_diff | |
comp6_rate, comp6_inv, comp6_rate_percent_diff | |
comp7_rate, comp7_inv, comp7_rate_percent_diff | |
comp8_rate, comp8_inv, comp8_rate_percent_diff | |
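
Several columns above use special values: null for missing data, a 0 star rating for "unknown", and a signed flag for the competitor price comparison. The following sketch shows one way a single row could be summarized under those conventions. It is not the organizers' code. It assumes rows are read as dictionaries of strings (as `csv.DictReader` would produce), that nulls appear as empty fields or the literal text `NULL`, and that compN_rate is +1, 0 or -1 for Expedia being cheaper, equal or more expensive. All function names are hypothetical.

```python
def parse_optional(value):
    """Convert a raw CSV field to a float; empty or 'NULL' fields become None."""
    if value is None:
        return None
    text = value.strip()
    if text == "" or text.upper() == "NULL":  # assumed null encodings
        return None
    return float(text)


def competitor_summary(row, competitors=range(1, 9)):
    """Count, over competitors 1 to 8, how Expedia's price compares."""
    counts = {"expedia_lower": 0, "same": 0, "expedia_higher": 0, "no_data": 0}
    for i in competitors:
        rate = parse_optional(row.get(f"comp{i}_rate"))
        if rate is None:
            counts["no_data"] += 1        # null: no competitive data
        elif rate > 0:
            counts["expedia_lower"] += 1  # +1: Expedia has the lower price
        elif rate < 0:
            counts["expedia_higher"] += 1 # -1 (assumed): Expedia is higher
        else:
            counts["same"] += 1           # 0: same price
    return counts


def star_rating_or_none(row):
    """Treat a prop_starrating of 0 as unknown, following the column description."""
    rating = parse_optional(row.get("prop_starrating"))
    return None if rating in (None, 0) else rating


# Made-up example row; competitors 5 to 8 are absent and count as no data.
example_row = {
    "prop_starrating": "0",
    "comp1_rate": "1",
    "comp2_rate": "-1",
    "comp3_rate": "0",
    "comp4_rate": "NULL",
}
summary = competitor_summary(example_row)
stars = star_rating_or_none(example_row)
print(summary, stars)
```

Keeping "no data" as its own count, rather than folding it into "same", preserves the distinction the table draws between a null and a 0.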

https://www.kaggle.com/c/expediapersonalizedsort/data


2015 Kaggle Inc
