Commit 665e2c63 authored by Wohlgemuth, Jason

docs(paper): Add custom workflow section; update SoN; add future work bullets for development

parent af7622f0
+255 KiB
+1.26 MiB
+19 −4
@@ -29,16 +29,31 @@ bibliography: paper.bib
> Accessible Content Optimization for Research Needs (ACORN) applies standardization, automation, linked data, and institutional knowledge to research activity data (RAD) to draw insights that serve multiple audiences and aims. ACORN is a command-line multitool that creates analysis-ready data from RAD and can also run on remote continuous integration servers for shared RAD repositories. ACORN uses a set of automated processes to inform and/or enforce defined content schemas, producing standardized and highly structured data. Because its data source is standardized, ACORN can readily automate the generation of communication assets such as PDFs, PPTs, and web pages. Built in memory-safe Rust, ACORN is portable and accessible for use on any Windows, Mac, or Linux machine.

# Statement of need
Communicating research can be difficult, from the high-level scope of a science-focused organization down to individual projects within that organization. Researchers are asked to communicate their research but are limited by gaps in their professional communication skills. Science communicators are asked to promote research but are limited by their understanding of complex contexts and domain-specific details. Finally, research communication is further complicated by a lack of standardization in research data and metadata, which prevents external audiences, such as jobseekers, policymakers, funders, and the general public, from finding the information they need.
Communicating research can be difficult, from the high-level scope of a science-focused organization down to individual projects within that organization. Science data systems created to help communicate research are often isolated and/or specialized to individual suborganizations, teams, or domains. True innovation requires reusable systems that can standardize data across domain boundaries and serve as a nexus for scientists, developers, and communicators.

Research communication is further complicated by a lack of consistency and documentation in research data and metadata, which prevents external audiences, such as jobseekers, policymakers, funders, and the general public, from finding the information they need.

Traditional research practices are built on antiquated processes that have hardened into habit. These habits are particularly dangerous in an environment where getting published is critical to a researcher's livelihood. Researchers may be tempted to do the bare minimum, skip steps, and pursue sensational or novel paths in the name of journal acceptance and credibility. These practices have led to the twin reproducibility [@Baker: 2016] and replicability [@Camerer: 2018] crises.

Trustworthy research is hard work, harder than closed-model research. But automation and data architecture, enabled through ACORN, can make it easier.

ACORN can enable quick analysis of research project portfolios, allowing decision-makers to select solutions for execution, sponsor discussions, and mission applications. ACORN has three main outputs: (1) analysis-ready data applicable to AI/ML research; (2) target artifacts, produced by the ACORN-enabled content process, which creates a single source of truth for research activity data from which users can generate content pieces; and (3) understanding, gained by maintaining data in a consistent format that supports programmatic analysis and better application of AI/ML practices. This collection of tools allows researchers to leverage the benefits of connected data and automate numerous tasks essential to science and communication.
ACORN can enable quick analysis of research project portfolios, allowing decision-makers to select solutions for execution, sponsor discussions, and mission applications. ACORN has three main outputs: (1) analysis-ready data applicable to AI/ML research; (2) target artifacts, produced by the ACORN-enabled content process, which creates a single source of truth for research activity data from which users can generate communication artifacts; and (3) understanding, gained by maintaining data in a consistent format that supports programmatic analysis and better application of AI/ML practices. This collection of tools allows researchers to leverage the benefits of connected data and automate numerous tasks essential to science and communication.

# Research Activity Data Workflows
At the intersection of research and communications, research activity data describes an identifiable package of work involving organized, systematic investigation. ACORN helps capture, standardize, and analyze research at the project level, one organization at a time. 

![ACORN works with unique data schemas and applies automation to analyze, format, export, and download research activity data. Its outputs include 1) analysis-ready data: highly standardized data from the automated check process immediately applicable to AI/ML research; 2) target artifacts: communication pieces such as PDFs and web pages; 3) programmatic data analysis made easier through standardized, structured data.](./figures/acorn_input_output.png)

![Within an organization, ACORN allows research activity data to be documented, analyzed, and publicized - even for projects without publications. By entering at the project level, ACORN provides unique visibility into a new set of variables that incrementally begin to document the broader scientific ecosystem through widespread adoption.](./figures/acorn_workflow.png)

# Future Work
- nanopublications
- knowledgebase = ontology + instances
- linked data
- oakley (RAG, agents, etc.)
- expansion (e.g., DoD, other labs)

# Acknowledgment
This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the work for publication, acknowledges that the US government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the submitted manuscript version of this work, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the [DOE Public Access Plan](https://energy.gov/doe-public-access-plan).

# References
> 🚧 Under construction