Abstract Detail

Bioinformatic and Biometric Methods in Plant Morphology

Tcheng, David [1], Nayak, Ashwin [2], Punyasena, Surangi [3].

A rigorous test of the relative performance of automated pollen identification and human expert abilities.

We present a system for high-throughput pollen identification and classification, designed as a model for future supervised palynological identification systems. The system combines robotics, machine learning (ML), and supercomputing. A robotic scanning microscope (NanoZoomer, designed for pathology imaging) was used to scan batches of pollen slides at high magnification (400x), across the entire area of the pollen sample and at 65 axial z-planes spaced at 1 micron intervals. Uncompressed image data was acquired off the microscope at a rate of 70 MB/s, with a fully imaged slide totaling about 400 GB. We tested the system against 22 slides with known black and white spruce pollen ratios, which were manually constructed using pollen samples from trees identified in the field. The training slides for ML contained a large number of pollen grains from two individuals of each species. The testing slides were constructed with known ratios of black and white spruce pollen, using three unique individuals per species. The goal of ML was to predict these ratios given only the training slides. For improved efficiency, the problem was decomposed into two subproblems: (1) learning to identify all regions of the slide image likely to contain pollen (pollen spotting) and (2) predicting which species of pollen is present in each of these regions (pollen classification). We compared the performance of different ML algorithms, including simple methods (linear discriminant functions) and complex methods (instance-based learners and decision trees), on these two types of ML problems. We used the Lonestar and Stampede supercomputers (NSF XSEDE supercomputers) to process our 5.9 TB of image data and to optimize the ML algorithm parameters. Performance was measured in terms of accuracy, speed, and model size. In our final, ongoing analysis, we are comparing human and machine performance.
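
The two-subproblem decomposition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `LinearDiscriminant`, the function `estimate_ratio`, the two-element feature vectors, and all weights are hypothetical, chosen only to show how a spotting stage can filter candidate regions before a classification stage counts species and yields a ratio.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Features = Sequence[float]


@dataclass
class LinearDiscriminant:
    """Hypothetical linear discriminant: positive score w . x + b means 'yes'."""
    weights: Sequence[float]
    bias: float

    def score(self, x: Features) -> float:
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def predict(self, x: Features) -> bool:
        return self.score(x) > 0.0


def estimate_ratio(
    regions: List[Features],
    spotter: LinearDiscriminant,
    classifier: LinearDiscriminant,
) -> Tuple[int, int]:
    """Return (black_count, white_count) for one slide's candidate regions."""
    # Stage 1 (pollen spotting): keep only regions likely to contain pollen.
    pollen = [r for r in regions if spotter.predict(r)]
    # Stage 2 (pollen classification): label each kept region by species.
    black = sum(1 for r in pollen if classifier.predict(r))
    white = len(pollen) - black
    return black, white


# Toy data (assumed): feature 0 stands in for "looks like pollen",
# feature 1 for "looks like black spruce". Real features would come
# from the slide images.
spotter = LinearDiscriminant(weights=[1.0, 0.0], bias=-0.5)
classifier = LinearDiscriminant(weights=[0.0, 1.0], bias=-0.5)
regions = [(0.9, 0.8), (0.8, 0.2), (0.1, 0.9), (0.7, 0.9), (0.6, 0.1)]

black, white = estimate_ratio(regions, spotter, classifier)
print(f"black:white = {black}:{white}")
```

Running the cheap spotting stage first means the classifier only sees regions already judged to contain pollen, which is one plausible reading of the efficiency gain mentioned in the abstract.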

1 - University of Illinois, Illinois Informatics Institute, 1205 West Clark Street, Urbana, IL, 61801, USA
2 - University of Illinois, School of Integrative Biology, 505 S Goodwin Avenue, Urbana, IL, 61801, USA
3 - University of Illinois, 505 S Goodwin Ave, 139 Morrill Hall, Urbana, IL, 61801, USA

machine learning
machine vision

Presentation Type: Symposium or Colloquium Presentation
Session: C2
Location: Prince of Wales/Riverside Hilton
Date: Monday, July 29th, 2013
Time: 3:30 PM
Number: C2008
Abstract ID: 897
Candidate for Awards: None

Copyright 2000-2012, Botanical Society of America. All rights reserved