Judges’ Responses to Submissions

November 20th, 2009 | Categories: Judging at the Contest

Here are the responses that we use in the Southeast Region:

  • Correct!
  • Compile-Time Error
  • Runtime Error
  • Time Limit Exceeded
  • Incorrect Output
  • Incorrect Format
  • Incorrect – Contact Staff

At first blush, they may seem pretty simple and straightforward. There can be some subtleties, though. Let’s go through them:

Correct! pretty much means that the code produced the expected output. At SER, we use the PC2 contest management system. That system has an autojudger, which will render a verdict of Correct or Incorrect. When it says Correct, the judges usually just go with that. If it says Incorrect, we have to look and see why. But even when it says Correct, it can be wrong. The autojudger can be set up to react in various ways to whitespace – at SER 2009, we told it to ignore whitespace. We know of at least one case in 2009 where a submission for Minesweeper printed spaces between characters and should have been judged Incorrect, but the autojudger said Correct.
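To make the whitespace problem concrete, here is a minimal sketch of a comparison that throws away all whitespace before comparing. This is not PC2’s actual validator; the function names and the sample grid are made up for illustration, and they only assume a Minesweeper-style answer printed as rows of characters.

```python
def matches_exactly(expected: str, actual: str) -> bool:
    """Strict comparison: every character, including whitespace, must match."""
    return expected == actual


def matches_ignoring_whitespace(expected: str, actual: str) -> bool:
    """Lenient comparison: delete all whitespace from both outputs first.

    Hypothetical stand-in for an autojudger told to ignore whitespace.
    """
    def squeeze(text: str) -> str:
        # str.split() with no arguments splits on any run of whitespace,
        # so joining the pieces removes spaces, tabs and newlines alike.
        return "".join(text.split())

    return squeeze(expected) == squeeze(actual)


# Illustrative data only: the judges' grid and a team's spaced-out version.
judge_output = "*100\n2210\n"
team_output = "* 1 0 0\n2 2 1 0\n"

print(matches_exactly(judge_output, team_output))
print(matches_ignoring_whitespace(judge_output, team_output))
```

The strict check rejects the spaced-out grid and the lenient one accepts it, which is the kind of false Correct described above.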

Compile-Time Error is perhaps the most straightforward of all the judgments. The submitted program didn’t pass the compiler. We don’t see this much – when we do, it’s either at the beginning of the contest, when a team didn’t understand the system and sent a .class (for Java) or an a.out (for C++) instead of their source, or at the end, when they’re throwing Hail Marys at us. There is a subtlety here, though – what about interpreted languages? It can be tough to differentiate between Compile-Time and Runtime Errors.

Runtime Error encompasses your division-by-zeroes, your array-index-out-of-bounds, and such. This is also what we’ll give if the program tries to open a file (instead of reading from standard input), or runs out of memory, which aren’t logic errors per se. This seems obvious, but it can be a little tricky. The judges ignore stderr – so how do they know if a program crashed? Moreover, Java can throw exceptions but keep on running, in which case we ignore the exceptions – so, if a Java program stops producing output, is it because it crashed? Or did it just not print the last test cases, and the exception was incidental? Was it a fatal or a nonfatal exception? What if a team has their end-of-input sensing code wrong, produces correct output and then crashes? Is it Correct! or is it a Runtime Error? We try to look at the code and make our best judgment on a case-by-case basis. However, I would not be surprised if we let a few Runtime Errors through with Correct! because we trusted the PC2 autojudger on a “Correct” and didn’t check stderr.
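One signal that doesn’t depend on reading stderr is the process exit status. The sketch below is only an illustration of that idea, not a description of how SER or PC2 runs submissions; `run_submission` and its return labels are hypothetical. Note that it would not settle the hard case above: a program that prints correct output and then crashes still comes back with a nonzero status and its output, and somebody has to decide what that means.

```python
import subprocess
import sys


def run_submission(cmd, input_text, time_limit):
    """Run a command on the given input and report how it ended.

    Returns (status, stdout). The status labels are illustrative only.
    """
    try:
        proc = subprocess.run(
            cmd,
            input=input_text,
            capture_output=True,  # stderr is captured but deliberately unused
            text=True,
            timeout=time_limit,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "Time Limit Exceeded", ""
    if proc.returncode != 0:
        # A nonzero exit status usually means an uncaught fatal error.
        return "Runtime Error", proc.stdout
    return "Ran to completion", proc.stdout


if __name__ == "__main__":
    # Stand-in "submission": a one-line Python program that divides by zero.
    crashing = [sys.executable, "-c", "print(1/0)"]
    print(run_submission(crashing, "", 10.0)[0])
```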

Time Limit Exceeded is the most controversial. It means the program ran too long – but what is “too long”? The judges set a time limit for each problem; there is no global time limit. The limit is based on the programs written by the judges, running on the judge data. We usually take the slowest of those programs and multiply its time by ten, but there can be exceptions. I know that there’s going to be a lot of discussion on this topic, so I will soon post an article on this alone.
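As a rough sketch of that rule of thumb (slowest judge program, times ten), assuming the judges’ run times are measured in seconds; the function name and the sample timings are invented for illustration, and this ignores the exceptions mentioned above:

```python
def time_limit_seconds(judge_times, factor=10.0):
    """Per-problem time limit: slowest judge solution multiplied by a factor.

    judge_times: run times (in seconds) of the judges' own programs on the
    judge data for one problem. The factor of ten is the usual rule of thumb.
    """
    if not judge_times:
        raise ValueError("need at least one judge solution time")
    return max(judge_times) * factor


# Made-up timings for three judge solutions to one problem.
print(time_limit_seconds([0.5, 1.5, 0.25]))  # slowest is 1.5 s, so 15.0
```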

Incorrect Output and Incorrect Format are given if your output doesn’t match ours. The old, conventional wisdom is that Incorrect Output is given if your answers are wrong, and Incorrect Format is given if your answers are right, but don’t match our desired format. For example, if you print your floating point numbers to 2 decimal places when we ask for 3, you’ll get Incorrect Format. We’ll also give Format if the output is so messed up, we can’t tell if the right answers are buried in there somewhere or not. This can happen if the program messes up white space and CR/LFs (and outputs a number salad), or if it included lots of debugging prints that the competitor forgot to disable.
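Here is a minimal sketch of that conventional split between the two verdicts. It is a heuristic of my own for illustration, not what the judges or PC2 actually run: it treats two numeric tokens as “the same answer” if they agree once both are rounded (half up, an assumption) to the coarser of the two precisions. It does not try to model the “number salad” case, where a human gives Format because the output is unreadable.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP


def same_value(expected_tok: str, actual_tok: str) -> bool:
    """True if two tokens carry the same answer, ignoring printed precision."""
    if expected_tok == actual_tok:
        return True
    try:
        e, a = Decimal(expected_tok), Decimal(actual_tok)
        if not (e.is_finite() and a.is_finite()):
            return False
        # Compare at the coarser precision, e.g. 3.142 vs 3.14 -> 2 places.
        exp = max(e.as_tuple().exponent, a.as_tuple().exponent)
        unit = Decimal(1).scaleb(exp)
        return (e.quantize(unit, rounding=ROUND_HALF_UP)
                == a.quantize(unit, rounding=ROUND_HALF_UP))
    except InvalidOperation:
        # Non-numeric tokens that differ are simply different answers.
        return False


def classify(expected: str, actual: str) -> str:
    """Hypothetical verdict for one run, following the conventional wisdom."""
    if expected == actual:
        return "Correct!"
    e_toks, a_toks = expected.split(), actual.split()
    if len(e_toks) == len(a_toks) and all(
        same_value(e, a) for e, a in zip(e_toks, a_toks)
    ):
        return "Incorrect Format"  # right answers, wrong presentation
    return "Incorrect Output"      # the answers themselves differ


print(classify("3.142\n", "3.14\n"))  # 2 decimal places when 3 were asked for
print(classify("3.142\n", "2.72\n"))  # a different answer altogether
```

The first call comes back as Incorrect Format and the second as Incorrect Output.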

Incorrect – Contact Staff is a response we use rarely. We use it when there’s something that the team just isn’t getting, and we want to give them some help. If they repeatedly submit their a.out, or their .class file, or they read from a file instead of standard input, or for some other reason they just don’t get it, we’ll give them this response, and then call the staff at their site to give them a heads-up. We’ll only use this well into the contest, and only for teams that are clearly not competing for the top spots.

So, what if more than one thing is wrong with a program? You will get… whatever the judge sees first. That’s right, there’s no prescribed hierarchy. If more than one of the incorrect judgments holds, you could submit exactly the same program multiple times and get different kinds of “incorrect” responses. If it’s correct, it’s correct, of course, but if it has wrong output, wrong format, and it crashes, you could get any of those three. A crashing program tends to be obvious, so more often than not, you’ll get Runtime Error in that case. Likewise, when a program has a compiler error, it produces no output, so you’ll get Compile-Time Error. But if your program prints the wrong answers, and prints them to 2 decimal places when we asked for 3, you could get either Incorrect Output or Incorrect Format, depending on which the judge noticed first.

Checking the World Finals rules, it looks like they have reduced their responses. They’ve combined Compile-Time Error with Runtime Error, and just give Run-Time Error. They’ve combined Wrong Answer (our Incorrect Output) with Presentation Error (our Incorrect Format), and they just give Wrong Answer. They wouldn’t give Contact Staff at Finals, so that’s it. Maybe we’ll go that way at Regionals next year – at the very least, you’ll probably see the elimination of Incorrect Format.

  1. yiuyuho
    April 19th, 2010 at 23:43

    Hey, I finally registered for this. May be next year the site can be announced on the SER regional web page and during the contest, so more people knows about it. This is pretty good going.
