Making Sense of High-Dimensional Data and Visualizations
from 15:00 to 16:00
Dr. Alyssa Goodman (Harvard University)
The "data deluge" in science is old news. Now, it's pouring, and we need working tools to collect, sort out, understand, and keep what is falling down on us.
In astronomy, the greatest insights very often come from studies that combine more than one "band" of data (e.g. optical, infrared, radio, X-ray). Moreover, data sets aren't just large--they are often also high-dimensional, in that they contain information about flux as a function not just of position on the sky, but also of a third dimension (e.g. frequency, velocity) and/or of time. Life science, geophysical, and geospatial data all present similar challenges.
In this talk, I will focus on examples drawn from my group's research on star formation in molecular clouds. In particular, I will show how new visualization and statistical analysis techniques relying on interactive high-dimensional views of data and on automated algorithms for "segmenting" data give new insight. "Segmentation" in imaging terms refers to extracting the meaningful structures from data, and I will show results from both dendrogram (tree-hierarchy) and machine-learning approaches. I will emphasize how the visualization of segmentation results is critical for understanding. The highlighted science results will show how we can now--for the first time--quantitatively but intuitively understand the connections between the "real" (position-position-position) space where simulations (e.g. of star formation) can be made and the "observational" (e.g. position-position-velocity) space available to earthbound astronomers. As a result of this newfound understanding, we can place important limits on the validity of virial-theorem-based calculations of the properties of gas--allowing, for example, for better estimates of which gas in star-forming regions is most likely to stay bound long enough to form stars.
Even though this abstract may sound technical to non-star-formation or non-computational researchers, my goal will be to keep the talk accessible to non-experts, so people from other fields faced with high-dimensional data and visualization challenges should feel free to join in--and to ask questions!