My research centers on machine learning and information retrieval. Most of my work
has involved adapting and applying these techniques to the software engineering domain and bio/chemical
informatics. I collaborate closely with the Baldi lab at UCI.
I enjoy collaborating with ambitious and enthusiastic undergraduates who are eager to perform original research.
I’m a firm believer in allowing undergrads to co-author research papers with me, and have had several papers
published with my advisees. My students include:
Publications (* denotes equal contributors)
Internet-Scale Software Repositories. Data Mining and Knowledge Discovery. Volume 18, Number 2. April 2009. (online)
J. Chen, E. Linstead, S. Swamidass, D. Wang, P. Baldi. ChemDB Update: Full-text Search and Virtual
Chemical Space. Bioinformatics. Volume 23, Number 17. September 2007. (online)
E. Linstead, L. Hughes, C. Lopes, P. Baldi. Information-Theoretic Metrics for Project-Level
Scattering and Tangling. International Conference on Software Engineering and Knowledge
E. Linstead, P. Baldi. Mining the Coherence of GNOME Bug Reports with Statistical Topic Models.
MSR 2009: Proceedings of the Sixth Working Conference on Mining Software Repositories.
J. Ossher, S. Bajracharya, E. Linstead, P. Baldi, C. Lopes. SourcererDB: An Aggregated Repository
Of Statically Analyzed and Cross-Linked Open Source Java Projects. Proceedings of the Sixth Working
Conference on Mining
Evolution. Proceedings of ICMLA 2008: International Conference on Machine Learning and
P. Baldi*, C. Lopes*, E. Linstead*, S. Bajracharya. A Theory of Aspects as Latent Topics.
OOPSLA 2008. Nashville, TN. October 2008. (online)
E. Linstead, P. Rigor, S. Bajracharya, C. Lopes, P. Baldi. Mining Internet-Scale Software
Repositories. Advances in Neural Information Processing Systems (NIPS*2007)
March 2008. (online)
E. Linstead, P. Rigor, S. Bajracharya, C. Lopes, P. Baldi. Mining Concepts from Code with
Probabilistic Topic Models. Proceedings of ASE 2007: International Conference on Automated
E. Linstead, L. Hughes, C. Lopes, P. Baldi. Software Analysis with Unsupervised Topic Models.
NIPS Workshop on Application of Topic Models: Text and Beyond. Neural Information
Processing Systems (NIPS 2009). Whistler, B.C. December 2009. (online)
E. Linstead, L. Hughes, C. Lopes, P. Baldi. Exploring Java Software Vocabulary: A Search and
Mining Perspective. Proceedings of
– Users, Interfaces, Tools, and Environments.
E. Linstead, P. Rigor, S. Bajracharya, C. Lopes, P. Baldi. Mining Eclipse Developer Contributions via
Author-Topic Models. Fourth International
Workshop on Mining Software Repositories.
MN. May 2007. (Voted best paper, MSR “Scale” Challenge). (online)
41st ACM Technical Symposium on Computer Science Education, SIGCSE 2010.
L. Hughes, P. Baldi, E. Linstead. The Evolution of Concerns, Scattering, and Tangling in Eclipse and
ArgoUML. Third International Symposium on Empirical Software Engineering and Measurement.
E. Linstead, L. Hughes, C. Lopes, P. Baldi. Capturing Java Naming Conventions with First-Order Markov Models.
ICPC 2009: Proceedings of the Seventeenth International Conference on Program Comprehension.
S. Bajracharya, T. Ngo, E. Linstead, Y. Dou, P. Rigor, P. Baldi, C. Lopes. Sourcerer: A Search Engine for Open
Supporting Structure-Based Search. OOPSLA ’06 Poster
S. Bajracharya, T. Ngo, E. Linstead, P. Rigor, Y. Dou, P. Baldi, C. Lopes. A Study of Ranking Schemes in
Internet-Scale Code Search. UCI ISR Technical Report #UCI-ISR-07-8. November 2007. (online)
Recent Invited Talks:
Searching and Mining Internet-Scale Software Repositories.
AI and Machine Learning Seminar. Dept. of Computer Science. UCI. November 10, 2008.
Google Tech Talk.
CPSC 229: C/C++ Programming (Fall 2004)
CPSC 229: Intermediate OO Programming (Interterm 2010)
CPSC 230: Computer Science I (Fall 2010)
CPSC 231: Computer Science II (Spring 2010, Fall 2010)
CPSC 252: Computer Architecture I (Spring 2004)
CPSC 285: Social Issues in Computing (Spring 2005, Fall 2009)
CPSC 350: Data Structures (Fall 2003, Fall 2008, Fall 2010)
CPSC 360: Computer Graphics (Interterm 2005, Interterm 2007, Interterm 2009)
CPSC 370: Data Mining (Spring 2008)
CPSC 370: Advanced OO Programming (Interterm 2010)
CPSC 390: Artificial Intelligence (Fall 2003, Spring 2010)
CPSC 406: Algorithm Analysis (Spring 2009)
CPSC 408: Database Systems (Fall 2010)
CPSC 499: Individual Research (Interterm 2009, Spring 2009, Summer 2009, Fall 2009)
Association for Computing Machinery (Senior Member)
Special Interest Groups on Artificial Intelligence, Computer Science Education, and
Knowledge Discovery in Data
Association for the Advancement of Artificial Intelligence (AAAI)
PC Member: SUITE 2010
Last Updated: August 9th, 2010