Creating a “TLDR” for Knowledge Workers
Every morning, the president of the United States receives a report summarizing the day’s most important national security information. The President’s Daily Brief is produced through the collaboration of many members of the U.S. intelligence community and is specifically tailored to what the president needs to know at that moment. Like the president, many knowledge workers—such as academics, intelligence analysts, or anyone who analyzes and adds value to data for a living—also need to be able to process massive amounts of information and prioritize the right knowledge at the right time. Unlike the president, however, they do not have a community of professionals who can create such reports for them every day. But what if advances in the fields of machine learning and artificial intelligence could be utilized to create not just one, but tens of thousands of tailored daily reports, or “TLDRs,” that are customized to the needs of many individuals?
To find out, the Laboratory for Analytic Sciences (LAS), a collaboration between NC State and the National Security Agency, brought together a diverse, interdisciplinary group of 40 researchers from academia, industry, and the intelligence community for its inaugural Summer Conference on Applied Data Science (SCADS) from June 13 to August 5. The overarching, multi-year challenge of SCADS is to generate TLDRs for knowledge workers that capture information relevant to their individual objectives and interests.
This year, conference participants focused primarily on two issues related to the grand challenge: (1) conceptualizing the content, format, user interactions, and automated analysis of a TLDR; and (2) investigating ways to produce tailored summaries of information from one or more large-scale, continually evolving multimodal datasets. Participants spent time learning about the problem domain and datasets, then collaborated in four groups to focus on the problems most interesting to them: text summarization, knowledge graphs, recommendation algorithms, and human-computer interaction.
“SCADS ties in with several of the lab’s strategic goals, including human analyst-machine collaboration, data triage solutions that assist analysts in prioritizing information, and increasing collaboration between government, academia and industry,” says Amy Brown Gagnon, director of LAS. “This event was made possible thanks to the team efforts of all our LAS staff.”
Guest speakers included subject matter experts such as government employees who shared examples of their workflows, current industry collaborators who spoke about the benefits of working with LAS, and Senator Richard Burr (R-NC), who spoke about the impact that innovation and technology can have on government.
On the last day, groups presented the results of their research, including an implementation of an explainable neural recommender system, exploratory data analysis of multiple datasets to determine their applicability to different use cases, and several proposed methods for identifying new information within a temporal dataset.
LAS plans for SCADS to be an annual immersive collaboration experience that brings together data scientists, intelligence analysts, researchers and software engineers. This year’s 40 attendees included students, faculty, industry partners and government representatives. They represented seven universities (NC State, Florida, Arizona State, Arizona, Penn State, UC Berkeley and Smith College). One of the industry participants was from the United Kingdom.
“This program provides practical data science training, networking opportunities, and access to diverse perspectives,” says Liz Richerson, a SCADS participant who specializes in computational linguistics research. “This year’s group laid the bricks upon which future SCADS researchers can build as we progress toward our goal of creating quick, custom intelligence reports for analysts.”