
Completed • $250,000 • 173 teams

GE Flight Quest

in partnership with
Wed 28 Nov 2012 – Mon 11 Mar 2013

Final Leaderboard Set Release


The final evaluation set is now up on the data page.

You may make a submission on the final evaluation set regardless of whether you've submitted a final model or hash. However, you are only eligible for prize money if you submitted a final model and use the same model to make predictions on this new set.

When you make a submission on this new set, a public score of "0.0000" without errors means the submission was processed correctly.

Please let us know if you find any issues with the data on this thread.

Good luck!

Hi Ben, 

I did some sanity checks on my submission (just to stay sane until the deadline) and found one flight for which the predicted gate and runway arrivals are totally off (they differ by more than 200 minutes). I investigated the issue: it was flight 3670 from DTW to GSO on Feb 28. I checked FlightStats: the flight took off 2 minutes ahead of its scheduled (gate) departure of 10:20 am (runway departure is about 20 minutes later), but the actual gate arrival is 02:43 on Mar 1, so the actual block time is more than 16 hours, while the scheduled block time is 113 minutes!
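A sanity check like the one described can be sketched in a few lines of Python, using the times quoted above (the function and threshold names here are made up for illustration, not part of the competition code):

```python
from datetime import datetime

def block_minutes(dep, arr):
    """Block time in minutes between departure and gate arrival."""
    return (arr - dep).total_seconds() / 60.0

def is_suspicious(scheduled_block, actual_block, tolerance=200):
    """Flag a flight whose actual block time exceeds schedule by more than `tolerance` minutes."""
    return actual_block - scheduled_block > tolerance

# The DTW -> GSO flight described above: departed 10:20 on Feb 28, arrived 02:43 on Mar 1.
dep = datetime(2013, 2, 28, 10, 20)
arr = datetime(2013, 3, 1, 2, 43)
actual = block_minutes(dep, arr)   # 983 minutes, vs a scheduled 113
print(is_suspicious(113, actual))  # True
```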

Here is the entry on FlightStats: http://www.flightstats.com/go/FlightStatus/flightStatusByFlight.do?id=289751327&airlineCode=9E&flightNumber=3670

Is this flight still valid?

When you're making your final predictions, you should be applying your previously submitted model to the flights that we've asked you to make predictions on and assuming that all flights in the test set are valid.

I can't provide guidance on individual cases at this point in time (and you shouldn't be manually modifying any of your predictions). However, keep posting any potential inconsistencies you find and we'll go through them in detail after the March 11 deadline (which may result in dropping a very small number of obvious errors from the final evaluation set).

Ben Hamner wrote:

When you're making your final predictions, you should be applying your previously submitted model to the flights that we've asked you to make predictions on and assuming that all flights in the test set are valid.

I can't provide guidance on individual cases at this point in time (and you shouldn't be manually modifying any of your predictions).

I'm well aware of that (I've already submitted my predictions).

Kaggle, admins and contestants,

I don’t find it easy to write this post and maybe I’m the only one who thinks this way (in that case, just ignore me).

In my opinion, in an ideal world, all the rules about which flights to include in the validation set (and which not) should be clear before the start of the competition.

If that isn't possible, they should be clear well before the model submission deadline.

What I see happening now is that we have rules, e.g.:

- no diverted flights
- no redirected flights
- no flights with actual_gate_departure after actual_runway_departure
- etc.

But alongside those rules there is still room for discussion.

I don’t find this a good development because:

a. It is an open invitation to everybody who thinks he or she has a chance of winning to dispute flights that are potentially harmful to their predictions. That would make this a challenge of predictive modeling AND public relations instead of predictive modeling alone.

b. It takes away Kaggle's ability to demonstrate that they work in an unbiased and transparent way, because in theory they could leave out flights (after March 11) that benefit one contestant more than others.

I'm NOT saying any contestant is thinking this way, or that Kaggle or any admins are biased; we are just making it harder to prove that we're not.

I propose we only filter the validation set according to rules that were clear before February 15th.

Jules

I can agree with your opinion. But... imagine a situation where you work for 3 months on a solution and an unexpected error or inconsistency in the data appears only in the FinalValidationSet. The expected performance on such observations is, in my opinion, random. When you revealed the diverted flights, the gains of the competitors' solutions varied between 0.20 and 0.40, and you can see what a 0.2 difference can mean on the leaderboard.

In real life you could correct the issue with the data and it wouldn't diminish the model's value. Here you start from the assumption that users will cheat (which is not true for 99.9% of us ;)). For future competitions I would think about a scoring formula that isn't impacted to such an extent by anti-cheating methodology.

My background is signal processing, and I only started learning machine learning last year. From a time-series point of view, the competition is well designed in terms of causality. However, it doesn't take much care of data consistency (technically speaking, stationarity). All the information is recorded by software, which is surely updated periodically. In particular, engineers in airline management also want to improve the expected arrivals that are important for most models here. Software updates definitely make features non-stationary. In fact, that is fine, since it is the problem we have to face in the real world. However, we should be able to observe such time-varying changes in the training set in order to provide more useful models for the hosts.

Anyway, we are all gambling on whether their system (including the data-processing flows) has changed too much recently ;)

Thanks Pawel. Jules, most likely no additional flights will be filtered from the final evaluation set.

However, there may be errors and inconsistencies in this data. If there's an extreme one (e.g. a flight that's 24 hours off), we don't want it skewing everyone's results and the error measurements (RMSE is especially sensitive to outliers), and we would remove that observation.
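The sensitivity of RMSE to a single outlier is easy to demonstrate with a toy example (this is an illustrative sketch, not the competition's scoring code):

```python
import math

def rmse(pred, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

# 100 flights predicted perfectly, except one record that is 24 hours (1440 minutes) off.
actual = [0.0] * 100
clean = [0.0] * 100
dirty = [0.0] * 99 + [1440.0]

print(rmse(clean, actual))  # 0.0
print(rmse(dirty, actual))  # 144.0 -- one bad record dominates the whole score
```

A single 24-hour error shifts the score by 144 minutes of RMSE, which dwarfs the 0.2 leaderboard differences mentioned above.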

Hi Ben,

We have a query regarding the submissions. We have used the times data from the Flight History file in our work. However, due to some particular values being MISSING in the data of the Final Evaluation Set, our predictions do not show for some flights. We can fix this issue with a small amount of code (around 5-6 lines) which:

  • Does not interfere with the model (The model files stay intact. The same files will be used)
  • Does not interfere with the calculation of the delays (The data that the model predicts on remains the same too)

This modification only comes in when the actual time is being calculated using the delays. Please clarify if this is an acceptable deviation so that we can submit accordingly. Thanks.
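The kind of 5-6 line fallback described (use a value from elsewhere when the actual field is MISSING) might look roughly like this; the function and field names are hypothetical, not from the competition files:

```python
def fill_missing_time(actual, scheduled):
    """Fall back to the scheduled time when the actual value is missing.

    A sketch of a missing-value fallback; the real Flight History files
    use their own column names and a "MISSING" sentinel rather than None.
    """
    return actual if actual is not None else scheduled

# Hypothetical row with a missing actual runway departure:
row = {"actual_runway_departure": None,
       "scheduled_runway_departure": "2013-02-22 10:40"}
dep = fill_missing_time(row["actual_runway_departure"],
                        row["scheduled_runway_departure"])
print(dep)  # 2013-02-22 10:40
```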

@Harish,

My practice was to submit not only the working directory but also the git repository. If I need to fix bugs, they can easily track and judge the changes.

Harish Krishnamurthy wrote:

Hi Ben,

We have a query regarding the submissions. We have used the times data from the Flight History file in our work. However, due to some particular values being MISSING in the data of the Final Evaluation Set, our predictions do not show for some flights. We can fix this issue with a small amount of code (around 5-6 lines) which:

  • Does not interfere with the model (The model files stay intact. The same files will be used)
  • Does not interfere with the calculation of the delays (The data that the model predicts on remains the same too)

This modification only comes in when the actual time is being calculated using the delays. Please clarify if this is an acceptable deviation so that we can submit accordingly. Thanks.

This is fine.

Sorry for repeating the same question, but what is the exact deadline for the final submission?

The details page says 11 March, but at what time, and in what timezone?

Even the 'make a submission' page does not give the details...

Actually, why don't you just specify the UNIX time?

Hi,

We have an issue similar to Harish.
Due to a bug in our code, two of the final test days resulted in an extra column as compared to all the training days. It's most probably caused by our insufficient cleaning of the ASDI data. This happens only for two days (2013_02_22,2013_02_26) and generates one extra column (nof_Linked.Airport.Closure.s.).

This caused a call to rbind() to fail, due to column mismatch. To fix this, we had to delete this extra column. This did not involve tweaking any of the model parameters, just correcting the pre-processing script.

The Error:
[2013-03-07 00:44:38] test day: 2013_02_22
Error in rbind(deparse.level, ...) :
  numbers of columns of arguments do not match
Calls: rbind -> rbind  
Execution halted

Our fix (in the pre-processing script):
[2013-03-10 11:10:57] test day: 2013_02_22
[2013-03-10 11:10:57] 51 52
[2013-03-10 11:10:57] nof_Linked.Airport.Closure.s. in xtr not in xtrain, removing column
[…]
[2013-03-10 11:13:40] test day: 2013_02_26
[2013-03-10 11:13:40] 51 52
[2013-03-10 11:13:40] nof_Linked.Airport.Closure.s. in xtr not in xtrain, removing column
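The fix described above (drop any column from a test day that never appeared in training, so the tables can be row-bound) can be sketched in Python; the R original uses rbind() on data frames, and the function name here is made up:

```python
def align_columns(day_rows, train_cols):
    """Drop columns from a test day's rows that did not appear in training.

    A Python sketch of the R pre-processing fix described above: rows are
    dicts keyed by column name, and any key absent from `train_cols` is
    removed so that the day can be row-bound with the training data.
    """
    keep = set(train_cols)
    extra = {c for row in day_rows for c in row} - keep
    if extra:
        print("removing columns not in training data:", sorted(extra))
    return [{c: v for c, v in row.items() if c in keep} for row in day_rows]

train_cols = ["dep_delay", "arr_delay"]
rows = [{"dep_delay": 5, "arr_delay": 7, "nof_Linked.Airport.Closure.s.": 1}]
print(align_columns(rows, train_cols))  # [{'dep_delay': 5, 'arr_delay': 7}]
```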


Again, please clarify if this is an acceptable deviation so that we can submit accordingly. Thanks!

vojtekb wrote:

Hi,

We have an issue similar to Harish.
Due to a bug in our code, two of the final test days resulted in an extra column as compared to all the training days. It's most probably caused by our insufficient cleaning of the ASDI data. This happens only for two days (2013_02_22,2013_02_26) and generates one extra column (nof_Linked.Airport.Closure.s.).

This caused a call to rbind() to fail, due to column mismatch. To fix this, we had to delete this extra column. This did not involve tweaking any of the model parameters, just correcting the pre-processing script.

The Error:
[2013-03-07 00:44:38] test day: 2013_02_22
Error in rbind(deparse.level, ...) :
  numbers of columns of arguments do not match
Calls: rbind -> rbind  
Execution halted

Our fix (in the pre-processing script):
[2013-03-10 11:10:57] test day: 2013_02_22
[2013-03-10 11:10:57] 51 52
[2013-03-10 11:10:57] nof_Linked.Airport.Closure.s. in xtr not in xtrain, removing column
[…]
[2013-03-10 11:13:40] test day: 2013_02_26
[2013-03-10 11:13:40] 51 52
[2013-03-10 11:13:40] nof_Linked.Airport.Closure.s. in xtr not in xtrain, removing column


Again, please clarify if this is an acceptable deviation so that we can submit accordingly. Thanks!

This is fine.

vojtekb wrote:

Sorry for repeating the same question again, but what is the exact deadline for the final submission?

The details page says 11 march, but what time, what timezone?

Even the 'make a submission' page does not give the details...

Actually, why don't you just specify the UNIX time?

The deadline is 11:59 PM UTC.

