All runs consisted of three passes over progressively smaller subsets of the collection:

(a) On-line logistic regression over the English ClueWeb09 collection, using all substrings of the query as binary features (alphabetic only, case insensitive). Only the first 35K bytes of each page were used. All 50 topics were processed in a single pass.

(b) Same as (a), but over only the enwp (Wikipedia) documents.

(c) Naive Bayes classifier, using binary byte 4-grams as features (no preprocessing at all, except for selection of the first 35K bytes of each page). Each topic was processed separately.

A short illustrative sketch of both feature schemes appears at the end of this description.

Training data:

  base run:
    very relevant: first-ranked document from (b)
    relevant: second-ranked document from (b)
    not relevant: 6,000 pages selected at random from the full English collection

  relfeed runs:
    very relevant: as per qrels
    relevant: as per qrels
    not relevant: 6,000 pages selected at random from the full English collection

  Note: very relevant examples were given double weight (trained twice).

Validation data: None. This is an automatic run, but we did compose 67 queries of our own that we used for pilot experiments.

"Test" data: The Naive Bayes classifier from (c) was run on the top 10K documents from (a) plus the top 10K documents from (b). Overall, the top-scored 1,000 documents were submitted to NIST.

P.S. Yes, indeed, we used spam filtering methods. The logistic regression was modified for speed and to process all 50 topics simultaneously. The Naive Bayes was an unmodified spam filter, run using the TREC spam filter toolkit. We knew from previous experiments that Naive Bayes is more robust to training noise than logistic regression; this seemed to be confirmed in our pilot experiments, which is why we used it for the relevance feedback pass.
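
For concreteness, here is a minimal Python sketch of the two feature representations used in the passes above. It is illustrative only: in particular, the way a query substring is matched against the page text, and the latin-1 decoding, are assumptions made for this sketch, not a description of the actual run code.

    # Illustrative sketch only -- not the run code.  How query substrings are
    # matched against a page, and the latin-1 decoding, are assumptions made
    # for this example.

    import re

    MAX_BYTES = 35_000  # only the first 35K bytes of each page are used


    def query_substring_features(query, page_bytes):
        # Pass (a)/(b): binary features are substrings of the query; here we
        # assume a feature fires when that substring occurs in the page text,
        # with both sides reduced to lowercase alphabetic characters.
        text = re.sub(r"[^a-z]+", " ",
                      page_bytes[:MAX_BYTES].decode("latin-1").lower())
        q = re.sub(r"[^a-z]+", " ", query.lower()).strip()
        substrings = {q[i:j] for i in range(len(q))
                      for j in range(i + 1, len(q) + 1)}
        return {s for s in substrings if s.strip() and s in text}


    def byte_4gram_features(page_bytes):
        # Pass (c): overlapping byte 4-grams of the raw page, with no
        # preprocessing other than truncation to the first 35K bytes.
        raw = page_bytes[:MAX_BYTES]
        return {raw[i:i + 4] for i in range(len(raw) - 3)}

In the actual runs, binary features of this kind were fed to the on-line logistic regression for passes (a) and (b), and to the unmodified spam filter's Naive Bayes for pass (c).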