
Completed • $250,000 • 173 teams

GE Flight Quest

Wed 28 Nov 2012 – Mon 11 Mar 2013

Deadlines

  • November 29, 2012
    Competition Launch
  • December 18, 2012
    Leaderboard Activated
  • February 14, 2013
    Model Submission Deadline
  • March 4, 2013
    Final Data Released
  • March 11, 2013
    Final Submission Deadline

Prize Pool

1st $100,000
2nd $50,000
3rd $40,000
4th $30,000
5th $20,000 

LSU Prize

$10,000

Data Files

File Name                                Available Formats
Reference                                .7z (141.76 kb), .gz (174.45 kb), .zip (173.70 kb)
InitialTrainingSet_rev1                  .7z (1.18 gb), .zip (2.95 gb)
PublicLeaderboardSet                     .7z (608.24 mb), .zip (1.49 gb)
scheduled_arrival_benchmark              .csv (536.87 kb)
estimated_arrival_benchmark_rev1         .csv (536.80 kb)
Key                                      .txt (100 b)
AugmentedTrainingSet1                    .7z (2.48 gb), .zip (6.21 gb)
redirected_or_diverted_flights_solution  .csv (560 b)
PublicLeaderboardTrainDays               .7z (1.29 gb), .zip (3.23 gb)
AugmentedTrainingSet2                    .zip (8.82 gb)
AugmentedTrainingSet2_take2              .7z (3.57 gb)
test                                     .csv (285.50 kb)
FinalEvaluationSet                       .7z (781.36 mb), .zip (1.85 gb)

You only need to download one format of each file. Each format has the same contents but uses a different packaging method.

 The Flight Quest data is temporally split into five datasets that will be released over the course of the competition. The names and dates of the flight data contained in each dataset are as follows:

  • Initial Training Set: November 12, 2012 - November 25, 2012 (14 days)
  • Public Leaderboard Set: November 26, 2012 - December 9, 2012 (14 days)
  • Augmented Training Set, Part 1: December 10, 2012 - January 2, 2013 (24 days)
  • Augmented Training Set, Part 2: January 3, 2013 - February 6, 2013 (35 days)
  • Final Evaluation Set: February 15, 2013 - February 28, 2013 (14 days)

The initial and augmented training sets contain all the flight and weather information for domestic US flights during the corresponding days. The public leaderboard set and the final evaluation set have this information for each day only up to a predetermined cutoff time.
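
The date ranges listed above can be checked mechanically. The following is a minimal sketch, not part of the official documentation: the `DATASETS` table and the helper names `day_count` and `dataset_for` are hypothetical, and it assumes Part 1 of the augmented training set ends on January 2, 2013, which is what its stated 24-day length implies.

```python
from datetime import date

# Inclusive date ranges as listed in the dataset description.
# Assumption: Part 1 ends on January 2, 2013 (consistent with "24 days").
DATASETS = {
    "Initial Training Set": (date(2012, 11, 12), date(2012, 11, 25)),
    "Public Leaderboard Set": (date(2012, 11, 26), date(2012, 12, 9)),
    "Augmented Training Set, Part 1": (date(2012, 12, 10), date(2013, 1, 2)),
    "Augmented Training Set, Part 2": (date(2013, 1, 3), date(2013, 2, 6)),
    "Final Evaluation Set": (date(2013, 2, 15), date(2013, 2, 28)),
}


def day_count(start, end):
    """Number of days in an inclusive date range."""
    return (end - start).days + 1


def dataset_for(day):
    """Return the name of the dataset covering `day`, or None if no set covers it."""
    for name, (start, end) in DATASETS.items():
        if start <= day <= end:
            return name
    return None


if __name__ == "__main__":
    for name, (start, end) in DATASETS.items():
        print(f"{name}: {day_count(start, end)} days")
```

Note that the listed ranges leave February 7 to February 14, 2013 uncovered, so `dataset_for` returns `None` for those dates.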

For each day in the test period (first for the public leaderboard, and later for the final evaluation), we will select a random time (uniformly chosen between 9am EST and 9pm EST) and select all of the flights in the air at that cutoff time. You will be provided with relevant data for each day that would be available at the chosen cutoff time.
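
To make the cutoff procedure concrete, here is a rough sketch of how such a selection could be simulated on the training data. It is an illustration under assumptions, not the organizers' actual procedure: the function names, the `departure` and `arrival` dictionary keys, and the use of a fixed UTC-5 offset for EST are all hypothetical choices.

```python
import random
from datetime import date, datetime, timedelta, timezone

# Assumption: EST is treated as a fixed UTC-5 offset (no daylight saving).
EST = timezone(timedelta(hours=-5))


def draw_cutoff(day, rng):
    """Pick a cutoff uniformly between 9am and 9pm EST on the given calendar date."""
    window_start = datetime(day.year, day.month, day.day, 9, 0, tzinfo=EST)
    window_seconds = 12 * 60 * 60  # 9am to 9pm is a 12-hour window
    return window_start + timedelta(seconds=rng.uniform(0, window_seconds))


def flights_in_air(flights, cutoff):
    """Keep flights that have departed but not yet arrived at the cutoff.

    Each flight is a dict with timezone-aware 'departure' and 'arrival'
    datetimes (hypothetical field names, not the real column names).
    """
    return [f for f in flights if f["departure"] <= cutoff < f["arrival"]]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the simulation is repeatable
    cutoff = draw_cutoff(date(2012, 11, 26), rng)
    print("simulated cutoff:", cutoff.isoformat())
```

Applying the same filter to the training days gives a way to build validation sets that resemble the test-period data.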

Your model must be structured so that its predictions for each test day in the final evaluation set are based on no information in the final evaluation data other than the information from that day, which will be in an appropriately named folder. (Reworded for clarification on 11/30/2012. See the forum for an explanation.)

The relevant data is divided into folders by day. For these purposes, a "day" includes all of the time from 1am PST / 4am EST / 9am UTC on that day until the same time on the next day. So Nov. 11, 2012 is considered to be from 9am UTC on Nov. 11, 2012 until 9am UTC on Nov. 12, 2012.
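
The day boundary described above amounts to shifting a timestamp back by nine hours in UTC and taking the calendar date. A minimal sketch follows; the function name `competition_day` is hypothetical, and it assumes timestamps are timezone-aware.

```python
from datetime import datetime, timedelta, timezone

DAY_START_UTC_HOUR = 9  # a "day" starts at 9am UTC (4am EST, 1am PST)


def competition_day(ts):
    """Return the date of the competition "day" that contains `ts`.

    `ts` must be a timezone-aware datetime; it is converted to UTC first.
    """
    utc = ts.astimezone(timezone.utc)
    return (utc - timedelta(hours=DAY_START_UTC_HOUR)).date()


if __name__ == "__main__":
    just_before = datetime(2012, 11, 12, 8, 59, 59, tzinfo=timezone.utc)
    boundary = datetime(2012, 11, 12, 9, 0, 0, tzinfo=timezone.utc)
    print(competition_day(just_before))  # still part of the Nov. 11 day
    print(competition_day(boundary))     # first instant of the Nov. 12 day
```

Under this definition, 8:59am UTC on Nov. 12, 2012 still belongs to the Nov. 11 folder, while 9:00am UTC on Nov. 12 starts the Nov. 12 folder.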

For a fuller description of the data, visit the Flight Quest Data wiki (information on the wiki is solely for background information and does not form part of the official documentation for Flight Quest).

Data provided by FlightStats, Inc., and other sources.