
Figure 1
An example collage of images from the Galaxy Zoo: Weird & Wonderful (GZ:W&W) project that have been discussed on the Talk boards, along with their corresponding volunteer-provided tags.

Figure 2
Left panel: The anomaly score versus the fraction of times volunteers identified a subject as interesting in the Galaxy Zoo: Weird & Wonderful (GZ:W&W) project (the volunteer chosen fraction), with selections by experienced volunteers who have substantial participation in GZ:W&W upweighted. The subset of subjects discussed on the Talk boards is shown as red points. Right panel: The feature score versus image score for our entire sample of ~200,000 GZ:W&W subjects, color-coded by the experienced-volunteer-weighted chosen fraction.

Figure 3
The probability distributions of the feature score (left) and image score (right) for our entire sample (black bars; the 99th percentile value is marked by the black dashed line), along with the subset that has a weighted chosen fraction >0.5 (green bars).

Figure 4
A visualization of the GZ:W&W subjects in the three prominent UMAP (Uniform Manifold Approximation and Projection) dimensions, color-coded by different quantitative metrics: anomaly score (panel a), image score (panel b), and feature score (panel c). Panel d shows those subjects that were #tagged in the “Talk” discussion boards (see legend; WC: #white_dwarf, SN: #supernova_candidates).

Figure 5
UMAP (Uniform Manifold Approximation and Projection) distribution of a subset of images validated with our logistic regression decision boundary (panel a) and those images that were chosen as satisfying the decision criteria (black points; panel b). Panel c shows the precision (P), recall (R), and F1 score, 2PR/(P+R), as a function of an applied lower limit on the feature score, S_feature. Panel d shows the precision versus recall of various logistic regression decision boundaries where each of the three parameters is incrementally thresholded: weighted chosen fraction (red points), feature score (blue crosses), and the product of feature score and weighted chosen fraction (green triangles).
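
The quantities plotted in panel c can be illustrated with a minimal sketch. It assumes that a subject counts as selected when its feature score is at or above the lower limit, and that a boolean label marks whether the subject is truly interesting; the function name `precision_recall_f1` and the toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np

def precision_recall_f1(scores, labels, threshold):
    """Precision, recall, and F1 for a lower limit applied to a score.

    Assumption: subjects with score >= threshold are selected, and
    `labels` is True for subjects that are genuinely interesting.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    selected = scores >= threshold

    tp = np.sum(selected & labels)    # selected and interesting
    fp = np.sum(selected & ~labels)   # selected but not interesting
    fn = np.sum(~selected & labels)   # interesting but missed

    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return float(precision), float(recall), float(f1)

# Toy data (hypothetical): five subjects with feature scores and labels.
toy_scores = [0.1, 0.4, 0.6, 0.8, 0.9]
toy_labels = [False, False, True, False, True]

for limit in (0.0, 0.5, 0.85):
    p, r, f1 = precision_recall_f1(toy_scores, toy_labels, limit)
    print(f"limit={limit:.2f}  P={p:.3f}  R={r:.3f}  F1={f1:.3f}")
```

Raising the lower limit in this toy case trades recall for precision, which is the behaviour a curve such as the one in panel c is meant to expose.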

Figure 6
The precision and recall of the logistic regression decision boundary derived by varying the binarizing threshold, Σ_f, for two different scores: weighted chosen fraction (left panel; Σ_CF) and the product of feature score and weighted chosen fraction (right panel; Σ_(CF × Feature)). In each panel, we also show the overall fraction of a new sample of images that requires visual inspection as a function of Σ.
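
As a rough illustration of the inspection fraction shown in each panel, the sketch below assumes that an image requires visual inspection when its score is at or above the threshold Σ. That selection rule, the function name `inspection_fraction`, and the toy scores are assumptions made for illustration and may differ from the procedure used in the paper.

```python
import numpy as np

def inspection_fraction(scores, sigma):
    """Fraction of a sample whose score is at or above the threshold sigma.

    Assumption: images with score >= sigma are the ones passed on
    for visual inspection.
    """
    scores = np.asarray(scores, dtype=float)
    return float(np.mean(scores >= sigma))

# Toy scores (hypothetical) for a new sample of eight images.
new_scores = [0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.9, 0.95]

for sigma in (0.25, 0.5, 0.75):
    print(f"sigma={sigma:.2f}  fraction={inspection_fraction(new_scores, sigma):.3f}")
```

Under this assumption the fraction falls monotonically as Σ increases, so the threshold sets the balance between inspection effort and recall.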
