A small proportion of the LAS portfolio consists of “advanced concepts” projects that align with our primary themes (Structured Analytic Tradecraft, Analytics, Sensemaking). We provide brief descriptions of several areas of interest below, although other proposals related to the themes are encouraged.
Collaboration: LAS offers a unique opportunity to study the interactions and collaborations among academics, government researchers and analysts, and industry performers. LAS employs an engaged scholarship model to both study and improve collaborations. Key areas of investigation include:
- Technology- and tool-enabled collaboration
- Optimizing interdisciplinary organization performance
- Methods and techniques for enabling collective sensemaking and problem solving
- Collaborative communication and decision making
Open Source Analysis: Analyses using openly available data (e.g., the Internet, press, television, video, photos, social media) raise a variety of issues that differ from those raised by analyses of data from, for example, designed scientific experiments. Key areas of investigation include:
- Data quality and veracity
- Data readiness, including temporal aspects like the “half life” of data and the “value” of data at a particular point in a processing flow
- Knowledge management
- Development of repeatable, scalable analysis processes and supporting technology
Analytic Integrity for Machine Learning: Machine learning model quality is evaluated through a mix of measures, including accuracy, precision and recall, generalization error, and model complexity. While continual verification, validation, and refinement of models are widely recognized as best practices, resource, process, and technology constraints can significantly impede efforts to maintain and update models. Without dedicated teams or tooling to support evaluation of model quality, we have limited ability to detect and correct for model drift, refine models based on new data such as user interactions with current model results or new data labels, and improve the presentation of model results to users. Three core research areas (privacy, trust, and cost) are essential factors in evaluating the deployment and sustainment of machine learning models. These factors influence the viability of a model and the confidence the end user has in its results and their application.
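As a rough illustration of two of the measures named above, the following minimal Python sketch computes precision and recall for binary labels and flags a possible case of model drift when recall on recent data falls well below a baseline. The function names, the 0.05 tolerance, and the label lists are hypothetical and chosen only for illustration; a real evaluation would likely rely on more robust statistical tests and larger samples.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels, where 1 = positive and 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def recall_dropped(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag possible drift when recall falls more than `tolerance` below the baseline.

    The tolerance is an illustrative assumption, not a recommended value.
    """
    return (baseline - current) > tolerance


# Hypothetical labels: the same ground truth scored at deployment time and later.
baseline_true = [1, 1, 1, 0, 0, 1, 0, 1]
baseline_pred = [1, 1, 1, 0, 1, 1, 0, 1]
recent_true = [1, 1, 1, 0, 0, 1, 0, 1]
recent_pred = [1, 0, 0, 0, 1, 1, 0, 0]

base_precision, base_recall = precision_recall(baseline_true, baseline_pred)
recent_precision, recent_recall = precision_recall(recent_true, recent_pred)

if recall_dropped(base_recall, recent_recall):
    print("Possible model drift: recall has fallen below the baseline tolerance.")
```

A check of this kind only signals that quality may have changed; deciding whether to retrain, relabel, or adjust how results are presented still depends on the resource and process constraints described above.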
Triage: Inexpensive storage and ubiquitous communication have led to many large, publicly available datasets. Analyzing these corpora requires knowledge of methods and a priori knowledge of what to look for, which favors a “just-in-time” approach to data analysis. In contrast, an approach that proactively discovers possible goals (e.g., hypotheses), manages these goals, and identifies workflows applicable to a dataset would enable prioritization and triage of data analysis tasks. Possible areas of interest include:
- Data-driven hypothesis generation
- Goal management systems
- Data-driven workflows