Using Open-Source Scripting Languages for Rapid-Development of Informatics Capabilities

Versek, Craig and Thorn, Michael and Tuominen, Mark T. (2010) Using Open-Source Scripting Languages for Rapid-Development of Informatics Capabilities. In: Nanoinformatics 2010, November 3 - 5, 2010, Arlington, VA. (Unpublished)

cversek_Nanoinformatics2010_talk.pdf - Presentation


Our experimental physics group specializes in the fabrication and characterization of materials and nanoscale systems with novel physical properties. We frequently use both commercial and custom test equipment setups to conduct material property measurements, which require long-running and repetitive control of multiple instruments. Such laboratory tasks have already benefited greatly from computer automation developed in-house, which improves the reliability of data acquisition and frees scientists for more creative tasks. However, the analysis of the characterization measurements often involves repetitive calculation and plotting of numerous data sets. Large volumes of data become increasingly difficult to handle using manual “spreadsheet” analysis techniques; instead, researchers might write custom programs to batch process the raw data into more meaningful parameters and visualizations. The combined automation of data acquisition, processing, organization, and visualization is the hallmark of informatics systems, which have proven successful in fields such as biology and high energy physics. The current informatics paradigm relies on massive collaborative efforts to develop highly specific hardware and software infrastructure that collects vast amounts of mostly homogeneous data. However, the bewildering heterogeneity and incompatibility of the software interfaces and data formats for the instrumentation used in materials science and nanotechnology research present a significant challenge to the developers of experimental informatics systems in this field. As an alternative, we propose that small research groups can leverage the power of open source software and high-level scripting languages (e.g., Python, TCL, Perl, or Ruby) to knit together existing tools, each of which excels at a different part of the complete informatics package.
“Open source” refers to software (and some hardware) products with licenses that protect the end users' rights to access and modify the original source materials, such as human-readable program code. Moreover, the open source paradigm encourages public collaborative development and sharing of knowledge, with the goals of producing interoperable and robust applications, typically free of charge. The above-mentioned scripting languages are themselves open source software environments that are specifically designed to coordinate, or “script,” other components as well as to perform general purpose programming tasks, while promoting gains in developer productivity. In this talk, we present examples from actual applications developed in our lab. We describe our usage of agile software development methodologies and our selection of Python as one of the most promising programming languages for laboratory informatics. The emphasis will be on the rapid development of custom software tools for scientific workflow management, computer-assisted data processing, reduction, and enhanced visualization.
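
As a rough illustration of the kind of batch processing the abstract describes, the following minimal Python sketch reduces several raw data sets to a single summary parameter each. It is not taken from the talk: the current-voltage measurement, the column names `voltage_V` and `current_A`, and the function names `fit_line`, `reduce_dataset`, and `batch_process` are all hypothetical, chosen only to show the pattern with the standard library.

```python
import csv
import io


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reduce_dataset(text):
    """Reduce one CSV data set to summary parameters.

    Assumes (hypothetically) two columns named voltage_V and current_A.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    voltage = [float(row["voltage_V"]) for row in rows]
    current = [float(row["current_A"]) for row in rows]
    # The slope of current vs. voltage is the conductance;
    # the resistance is its inverse.
    conductance, _ = fit_line(voltage, current)
    return {"n_points": len(rows), "resistance_ohm": 1.0 / conductance}


def batch_process(datasets):
    """Apply the same reduction to every named data set."""
    return {name: reduce_dataset(text) for name, text in datasets.items()}


# Made-up example data standing in for instrument output files.
raw = {
    "sample_a": "voltage_V,current_A\n0.0,0.0\n1.0,0.5\n2.0,1.0\n",
    "sample_b": "voltage_V,current_A\n0.0,0.0\n1.0,0.25\n2.0,0.5\n",
}
summary = batch_process(raw)
for name, params in sorted(summary.items()):
    print(name, params)
```

In practice the text of each data set would be read from instrument output files rather than from in-memory strings, and the reduction step would be replaced by whatever analysis the measurement calls for.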

Item Type: Conference or Workshop Item (Other)
Uncontrolled Keywords: open source software, scripting languages, Python, agile software development
InterNano Taxonomy: Informatics and Standards
Tool development
Collections: National Nanomanufacturing Network Archive > Conferences and Workshops > Nanoinformatics 2010
Depositing User: Rebecca Reznik-Zellen
Date Deposited: 23 Mar 2011 17:50
Last Modified: 23 Mar 2011 17:51
