Generative AI Tools in Research
Commercial generative AI tools have become increasingly popular in research as their capabilities have expanded.
This document provides guidance on their use in a research setting. For general use cases, refer to the Guidance on using Generative AI Tools by Berkeley Lab Cybersecurity.
What You Need to Know
- As with all tools and utilities, researchers bear responsibility for the use of generative AI tools in a research setting, for their incorporation into the research process, and for the reporting of results. Their use in research or research reporting must be acknowledged as appropriate.
- As there are no negotiated agreements with the providers of these tools, confidential, proprietary or sensitive data cannot be entered into these systems.
- Output from generative AI tools can present statements that look like facts but are not. Relying on unverified output can compromise the integrity of your research.
Processing Research Data
- Submit only non-sensitive, public information to these tools. Their use is not covered by negotiated agreements (see Berkeley Lab Cybersecurity Guidance), and any data entered may subsequently be used to train the AI tool and be presented back to other users.
- Proprietary, export-controlled or otherwise sensitive data, including personally identifying information or medical information, cannot be entered into these tools under any circumstances.
- Use generative AI tools to create processing scripts, or for any data processing or summarizing, only with great care, given that such tools can produce incorrect data. Check any output from a generative AI tool for validity.
- Research should be reproducible and replicable, so any processing performed with generative AI tools should be logged; one minimal way to validate and log such a step is sketched after this list.
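The sketch below illustrates one possible way to validate an AI-generated processing step against an independent computation and record it in a simple append-only log. It is an illustration only, not a prescribed format: the tool name, prompt, log fields, and file name are all hypothetical.

```python
import hashlib
import json
import statistics
from datetime import datetime, timezone
from pathlib import Path

def ai_generated_mean(values):
    # Stand-in for logic produced by a generative AI tool; in practice
    # this would be the script or function the tool wrote for you.
    return sum(values) / len(values)

def independent_mean(values):
    # Independent check written (or at least reviewed) by the researcher.
    return statistics.mean(values)

def log_ai_processing(logfile: Path, tool: str, prompt: str,
                      input_desc: str, output_desc: str, verified: bool):
    """Append a provenance record for one AI-assisted processing step."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,           # name and version of the AI tool used
        "prompt": prompt,       # what was asked of the tool
        "input": input_desc,    # description or hash of the input data
        "output": output_desc,  # description or hash of the output
        "verified": verified,   # whether the output passed validation
    }
    with logfile.open("a") as f:
        f.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]  # non-sensitive, public example data
    result = ai_generated_mean(data)

    # Validate the AI-generated logic against an independent computation.
    ok = abs(result - independent_mean(data)) < 1e-9

    # Record the step so the processing is reproducible and auditable.
    log_ai_processing(
        logfile=Path("ai_processing_log.jsonl"),
        tool="example-llm (version unknown)",
        prompt="Write a function that computes the mean of a list.",
        input_desc=hashlib.sha256(repr(data).encode()).hexdigest(),
        output_desc=f"mean={result}",
        verified=ok,
    )
    print(f"result={result}, verified={ok}")
```

Keeping a record like this alongside the data makes it possible to reconstruct later which outputs were AI-assisted and whether they were independently verified.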
Reporting Research Outputs
- Apply generative AI tools in reporting with great care, given that such tools can produce incorrect content, known as “hallucinations.” This includes literature references: such tools have been known to fabricate references. Ultimately, the authors of published research assume responsibility for any published content produced with generative AI tools and are responsible for its accuracy.
- Generative AI tools cannot assume responsibility for published research outputs and therefore cannot be listed as authors. Instead, their use should be considered carefully and declared in the methods section, similar to the use of other tools (for example: “Portions of this analysis were prepared with the assistance of [tool name and version]; all AI-generated content was reviewed and verified by the authors.”). Some journals require a detailed disclosure of how AI tools are used in research work and in the preparation of journal articles.
- The use of generative AI tools to create images or other visual representations should be considered only if scientifically necessary, and should be declared in the figure caption. You must ensure that you have a license to reproduce such images and that this license is compatible with the publisher’s license for the research output. Some publishers prohibit the use of figures created by generative AI tools.
- As noted above, sensitive, proprietary or unpublished data cannot be submitted to generative AI tools.
Peer Review
- The review of funding proposals or submitted publications is a confidential process, and generative AI tools cannot be used for peer review activities. Their use would be a breach of confidentiality. Funding agencies such as NIH explicitly prohibit their use for this purpose.
Questions
For questions about the use of generative AI tools in a research setting, please contact the Research Compliance Office at rco@lbl.gov.