Artificial Intelligence (AI) and the literature review process: Screening

Application of AI tools such as ChatGPT to searching and all aspects of the literature review process

Screening is the process of selecting studies for inclusion in a review. The process usually involves the removal of duplicate records using reference management software, checking the title and abstract to remove those studies clearly irrelevant, obtaining the full-text and applying pre-determined eligibility criteria to determine which studies should be included.

Screening to select studies for inclusion in the review is the most time-consuming aspect of the systematic literature review (SLR) process. It is also subject to error because it is a manual process. Researchers have therefore looked for ways to speed up the process and make it more accurate. The most recent SLR on the topic (van Dinter et al., 2022) found 41 studies on automating the process, all of which used machine learning techniques (Naive Bayes and Support Vector Machine (SVM) algorithms).

 

AI tools for screening

AI tools have also been used to speed up the screening process. Syriani et al. (2023) showed that ChatGPT can match the accuracy of these machine learning techniques without the need for training, which gives it "a realistic chance to revolutionize SLR automation" (p. 2). Syriani et al. (2024) then showed that ChatGPT could reach 82% accuracy when tested on five systematic literature review datasets. They concluded that LLMs were not ready to replace article screening by humans but offered promising ways to assist reviewers in the screening process, for example by discarding all the articles that ChatGPT has excluded.

You can tell the AI tool which records are definitely to be included in your selected studies. The AI tool can then be trained on your decisions to determine which of the remaining records should be included, and can re-rank those records to improve the efficiency of the screening process. Studies likely to be included then appear earlier in the screening process, rather than in random order as with manual screening.
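The re-ranking idea can be sketched in a few lines of Python. This is an illustration only, assuming a simple Naive Bayes word-count model; it is not a description of how any particular screening tool works. The function names (train, score, rerank) and the sample records are hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase word tokens from a title or abstract."""
    return re.findall(r"[a-z]+", text.lower())

def train(decisions):
    """decisions: list of (text, included) pairs already screened by a human."""
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for text, included in decisions:
        counts[included].update(tokens(text))
        n_docs[included] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, n_docs, vocab

def score(model, text):
    """Log-odds that a record is an include (higher means more likely relevant)."""
    counts, n_docs, vocab = model
    # Prior log-odds, with add-one smoothing so neither class has zero weight.
    log_odds = math.log((n_docs[True] + 1) / (n_docs[False] + 1))
    total = {label: sum(counts[label].values()) for label in (True, False)}
    for word in tokens(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_inc = (counts[True][word] + 1) / (total[True] + len(vocab))
        p_exc = (counts[False][word] + 1) / (total[False] + len(vocab))
        log_odds += math.log(p_inc / p_exc)
    return log_odds

def rerank(model, unscreened):
    """Return unscreened records with the most likely includes first."""
    return sorted(unscreened, key=lambda text: score(model, text), reverse=True)

# Hypothetical records, invented for illustration.
decisions = [
    ("Machine learning for citation screening in systematic reviews", True),
    ("Automating abstract screening with text classification", True),
    ("Soil nutrient levels in alpine meadows", False),
    ("Crop yield response to irrigation schedules", False),
]
unscreened = [
    "Irrigation and soil moisture in wheat crop fields",
    "Text classification to support screening of abstracts",
]
model = train(decisions)
ranked = rerank(model, unscreened)
print(ranked[0])
```

In this toy example the record about text classification moves to the top of the queue because its words overlap with the records the reviewer has already included. A real tool would retrain as more decisions are made.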

Applying an AI tool for screening has several advantages (Thomas et al., 2024):

  • it ensures a consistent interpretation of subject matter, especially where the subject is complex and there is subjectivity in applying inclusion criteria even among experts; 
  • the tool’s responses enable you to clarify terminology and identify which specific phrases might lead to confusion.

Thomas et al. (2024) applied ChatGPT 3.5 to the screening process, asking it to assess 2917 studies on indicators of ecosystem condition for relevance based on their title and abstract. It completed this process in 10% of the time taken by expert reviewers. Studies were classified as selected, rejected or uncertain. The choices made by the tool were compared to those made by expert reviewers. One version of the prompt achieved 100% accuracy in selecting studies but did so at the expense of precision, selecting a higher proportion of irrelevant studies than other versions of the prompt. Using a prompt that emphasises and repeats key terms improved the performance of the generative AI tool.
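One way to read this trade-off is in terms of the standard measures recall (the share of relevant studies that were selected) and precision (the share of selected studies that are relevant). The sketch below uses invented counts for two hypothetical prompt versions; the numbers are not taken from Thomas et al. (2024).

```python
def precision(true_pos, false_pos):
    """Share of selected studies that are actually relevant."""
    selected = true_pos + false_pos
    return true_pos / selected if selected else 0.0

def recall(true_pos, false_neg):
    """Share of relevant studies that were actually selected."""
    relevant = true_pos + false_neg
    return true_pos / relevant if relevant else 0.0

# Hypothetical counts: both prompts screen the same records,
# 100 of which are truly relevant.
prompts = {
    "inclusive prompt": {"true_pos": 100, "false_pos": 300, "false_neg": 0},
    "strict prompt": {"true_pos": 80, "false_pos": 40, "false_neg": 20},
}

for name, c in prompts.items():
    p = precision(c["true_pos"], c["false_pos"])
    r = recall(c["true_pos"], c["false_neg"])
    print(f"{name}: precision={p:.2f}, recall={r:.2f}")
```

In this made-up case the inclusive prompt misses nothing (recall 1.00) but three in four of its selections are irrelevant (precision 0.25), while the strict prompt is more precise but misses 20 relevant studies.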

Despite the availability of tools to automate title and abstract screening, there has been little uptake of these tools. One reason is the need for any automation method used in synthesizing evidence to be freely accessible and open to examination. Many of the 18 interviewees in a survey by Arno et al. (2020) emphasized that they were accountable to stakeholders who needed to be sure that information had not been missed and who therefore needed to be free to examine the methods used.

Rayyan AI

Rayyan AI is a web-based automated screening tool, developed by Qatar Computing Research Institute (QCRI), which launched in 2014.

It uses text mining techniques based on statistical pattern learning to recognize patterns in the data and identify relevant information.

 


Using Rayyan

First steps
  • Sign up for an individual account with Rayyan.
  • Check your browser compatibility using https://rayyan.ai/check
  • Select My Reviews.
  • Then select New review.
  • Give the review a title, select your research field from the drop-down menu, select the review type and the review domain (which is similar to the research field), add an optional description and press Create.

This review will now appear in the My Reviews tab when you enter Rayyan.

 

Exporting references

Rayyan then presents you with an option to import records directly from Mendeley.

For all other reference management software, such as EndNote or Zotero, select Upload references. These need to be in RIS format, so it is best to export your references from your reference management software in RIS format.

You will find it helpful to include the abstract in the export.
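Before uploading, you may want to check that the exported file really does contain abstracts. The sketch below is a minimal check, not a full RIS parser: it assumes the usual RIS line layout (a two-character tag, two spaces, a hyphen, then the value), that abstracts are stored under the AB or N2 tag, and that each record ends with an ER line. The function names and the sample records are hypothetical.

```python
import re

# A RIS line: two-character tag, two spaces, a hyphen, optional space, value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

def parse_ris(text):
    """Split RIS text into a list of {tag: [values]} dictionaries."""
    records, current = [], {}
    for line in text.splitlines():
        match = RIS_LINE.match(line.rstrip())
        if not match:
            continue  # this sketch ignores blank and continuation lines
        tag, value = match.groups()
        if tag == "ER":  # end of record
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    if current:
        records.append(current)  # tolerate a final record with no ER line
    return records

def missing_abstract(records):
    """Titles of records that have neither an AB nor an N2 field."""
    missing = []
    for rec in records:
        if not rec.get("AB") and not rec.get("N2"):
            title = (rec.get("TI") or rec.get("T1") or ["(untitled)"])[0]
            missing.append(title)
    return missing

# Hypothetical export, invented for illustration.
sample = """TY  - JOUR
TI  - First hypothetical study
AB  - A short abstract.
ER  -
TY  - JOUR
TI  - Second hypothetical study
ER  -
"""

records = parse_ris(sample)
print(missing_abstract(records))
```

Any titles printed are records that would reach the screening stage without an abstract, which suggests re-exporting with the abstract field enabled.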

 

Importing references
  • Remove any duplicates before you import into Rayyan.
  • Then choose Select files.
  • Find your file in your downloads folder.
  • Click upload then Continue. 

You can add new records to your review by selecting the New Search button top right.
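Reference management software normally handles the de-duplication step in the list above for you. As a rough illustration of the kind of matching involved, the sketch below treats two records as duplicates if they share a DOI or, where no DOI is present, the same normalized title and year. The field names (title, year, doi) and the sample records are assumptions made for this example.

```python
import re

def dedup_key(record):
    """Build a comparison key: the DOI if present, else normalized title and year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lowercase the title and collapse punctuation and spacing differences.
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return ("title", title, str(record.get("year") or ""))

def remove_duplicates(records):
    """Keep the first record seen for each key, preserving the original order."""
    seen, unique = set(), []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical records, invented for illustration.
records = [
    {"title": "Screening with AI: a review", "year": 2023, "doi": "10.1000/example.1"},
    {"title": "Screening With AI - A Review.", "year": 2023, "doi": "10.1000/EXAMPLE.1"},
    {"title": "Manual screening methods", "year": 2020},
    {"title": "Manual  screening methods", "year": 2020},
]
unique = remove_duplicates(records)
print(len(unique))
```

A limitation of this simple approach is that a record with a DOI will not match a copy of the same article that lacks one, so a manual check of near-duplicates is still worthwhile.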

 

Collaboration

The tutorial video from the Rayyan HelpCentre shows how this works in practice, including the collaboration functions.

You can invite colleagues to join you in the review:

  • Click All reviews, then select the review and click invite.
  • Your collaborators will then receive an email and can see the review in the Collaboration reviews tab.

 

Including and excluding articles

Your articles, once imported into Rayyan, appear as undecided in the inclusion decisions box (facet):

  • Select each article being screened.
  • Decide whether to include or exclude each article.
  • When including an article your name will appear beside the article in green.
  • When excluding an article, select the Reason option and choose from the list of pre-populated reasons.
  • You can create new reasons for excluding. Your article will then be marked as excluded and a label with your exclusion reason is added to the article in red.
  • You can select I or E or M as keyboard shortcuts.
  • Use shift and up/down arrows to select multiple references. 

Rayyan keeps a count of the number of articles tagged with each exclusion reason. Once you have made 50 decisions, you can click on Compute ratings. Rayyan's artificial intelligence engine will then compute the probability of each of the remaining records being included or excluded, based on the decisions you have already made. Each undecided record will receive a rating of 1 to 5, where 5 is the most likely to be included.

You can filter by Undecided to show the references on which you have not yet made a decision. The more decisions you make, the more accurate Rayyan's AI engine will be, as it learns from each decision.

 

Evidence

Harrison et al. (2020) reviewed 15 software tools that used machine learning techniques to support title and abstract screening. Six tools performed best, all scoring higher than 75% in the feature analysis, and these were included in a user survey of medical researchers, all of whom were involved with systematic reviews. The authors recommend Covidence and Rayyan to systematic reviewers looking for suitable and easy-to-use tools for screening. The remaining four tools were DRAGON, Abstrackr, Colandr and EPPI-Reviewer.

Rayyan speeds up the process of selecting which studies to include in a systematic review. In experiments on a set of 15 reviews, users reported time savings of the order of 40% on average compared with the tools they had previously been using. Rayyan’s two most important features compared with its competitors are its title and abstract screening assistance and the opportunity for authors to collaborate on the same review (Ouzzani et al., 2016).

Valizadeh et al. (2023) compared Rayyan with human reviewers in the manual screening of more than 2000 records from three systematic reviews. The screening was done in four stages. At the end of each stage, Rayyan was used to predict an eligibility score for the remaining records, which it assigns as a star rating. Rayyan proved a reliable tool for excluding ineligible records when records rated below 2.5 stars were excluded. The findings were confirmed by Dos Reis et al. (2023), who concluded that the use of software to screen titles did not remove any records that should have been included.
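If you want to check a star-rating threshold against your own decisions before relying on it, the logic can be sketched as follows. This is a hypothetical illustration: the record IDs, ratings and function names are invented, and only the 2.5-star cut-off comes from the study described above.

```python
def split_by_rating(ratings, threshold=2.5):
    """Split record IDs into (exclude, keep) lists by star rating."""
    exclude = sorted(rid for rid, stars in ratings.items() if stars < threshold)
    keep = sorted(rid for rid, stars in ratings.items() if stars >= threshold)
    return exclude, keep

def missed_includes(ratings, human_includes, threshold=2.5):
    """Record IDs that a human included but the threshold would exclude."""
    exclude, _ = split_by_rating(ratings, threshold)
    return sorted(set(exclude) & set(human_includes))

# Hypothetical star ratings and human decisions, invented for illustration.
ratings = {"rec1": 4.5, "rec2": 1.0, "rec3": 2.0, "rec4": 3.0}
human_includes = {"rec1", "rec4"}

exclude, keep = split_by_rating(ratings)
missed = missed_includes(ratings, human_includes)
print(exclude, keep, missed)
```

An empty list of missed records on a sample you have screened by hand gives some reassurance that the threshold is not discarding eligible studies in your own review.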

Cheng et al. (2018) compared three tools, Rayyan, Abstrackr and Colandr, and found them all to be valuable resources for facilitating the screening process. Rayyan achieved the best scores in the objective evaluation and was also rated the most user-friendly software by the raters.