A side-by-side evaluation of Llama 2 by meta with ChatGPT and its application in ophthalmology

Masalkhi, Mouayad; Ong, Joshua; Waisberg, Ethan; Zaman, Nasif; Sarker, Prithul; Lee, Andrew G.; Tavakkoli, Alireza

doi:10.1038/s41433-024-02972-y

Comment
Open access
Published: 12 February 2024

Mouayad Masalkhi ORCID: orcid.org/0000-0003-0811-358X¹,
Joshua Ong ORCID: orcid.org/0000-0003-4860-827X²,
Ethan Waisberg³,
Nasif Zaman⁴,
Prithul Sarker⁴,
Andrew G. Lee^{5,6,7,8,9,10,11,12} &
…
Alireza Tavakkoli⁴

Eye (2024)

Introduction

Llama 2, a product of Meta, represents the latest advancement in open-source large language models (LLMs). It has been trained on a massive dataset of 2 trillion tokens, which is a significant increase compared to its predecessor, Llama 1 [1]. Its ability to understand and generate language, combined with its optimized transformer architecture and fine-tuning methods, makes it a potentially valuable resource in various fields.

Llama 2 is available in a variety of sizes, with parameters ranging from 7 billion to 70 billion, and includes both pretrained and fine-tuned versions [1]. The fine-tuned models, known as Llama 2-Chat, have been optimized for dialogue applications [1].

The architecture of Llama 2 is based on an optimized transformer model, a network architecture that relies solely on attention mechanisms [1]. This allows the model to focus on different parts of the input sequence when generating an output, thereby enhancing its language understanding and generation capabilities [1].
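To make the idea of attention concrete, the following is a minimal sketch of generic scaled dot-product attention as described in the transformer literature. It is not Meta's implementation: a production model such as Llama 2 would also use learned projections, multiple attention heads, and masking, all of which are omitted here. The function names and the toy token vectors are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Output is the weighted sum of the value vectors
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: three token vectors (hypothetical values, not real embeddings)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` shows how strongly one position attends to every position in the input, which is the sense in which the model can "focus on different parts of the input sequence".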

In the field of ophthalmology, LLMs are approaching expert-level knowledge and reasoning skills and have the potential to provide valuable medical advice and assistance in areas where access to expert ophthalmologists is limited [1]. Another notable application of Llama 2 is Code Llama, a code generation model built on Llama 2, which has been trained on 500 billion tokens of code [1].

We told both AI chatbots that a patient was reporting sensitivity to light in both eyes and asked whether the patient should attend the emergency department (Fig. 1). Both Llama 2 and ChatGPT correctly recommended attending the emergency department and emphasized the importance of seeking urgent medical care. We found that the response generated by Llama 2 focused more on ways of alleviating the pain, while ChatGPT focused more on the different causes of photophobia. Both AI-generated outputs were specific and appropriate (Fig. 2).

**Fig. 1: A Side-by-Side Evaluation of Llama 2 by Meta with ChatGPT and Its Application in Ophthalmology. Last revision on November 21, 2023.**

**Fig. 2: A Side-by-Side Evaluation of Llama 2 by Meta with ChatGPT and Its Application in Ophthalmology. Last revision on November 21, 2023.**

In addition, we asked both AI chatbots what to do if a patient sees lines that appear wavy in one eye (Fig. 3). Generating a suitable suggestion from this input is challenging because this blurry vision can result from refractive errors, age-related conditions, eye injuries, eye infections, or more serious conditions.

**Fig. 3: A Side-by-Side evaluation of Llama 2 by Meta with ChatGPT and its application in ophthalmology. Last revision on November 21, 2023.**

Llama 2 provided a comprehensive list of steps and actions to perform to address the possible reasons for seeing blurry lines. ChatGPT mentioned the importance of seeking professional medical help as soon as possible, especially if sudden or severe vision changes are experienced. ChatGPT’s response focused more on the different causes of those symptoms, such as issues in the retina or optic nerve and a migraine aura, among others.

Finally, we examined Llama 2’s image analysis capabilities by providing a fundus image of glaucomatous cupping in the right optic nerve (Figure 5) [2]. Llama 2 stated that it is not a medical expert before providing any information. Llama 2 correctly identified the retina and blood vessels but did not identify the optic disc or remark on the optic disc cupping. GPT-4, on the other hand, correctly identified the optic disc, retina, and blood vessels. However, GPT-4 incorrectly stated that the image appears to be normal (Fig. 4).

**Fig. 4: A Side-by-Side evaluation of Llama 2 by Meta with ChatGPT and its application in ophthalmology. Last revision on November 21, 2023.**

Limitations

The Llama 2 model, a powerful LLM renowned for its natural language processing prowess, has been applied in various fields, including medicine and ophthalmology. However, like any other AI model, it has its limitations, which can impact its effectiveness and applicability in these fields.

One of the significant limitations of the Llama 2 model is its immense computational requirements. The model’s massive neural network architectures, with billions of parameters, demand state-of-the-art hardware and extensive resources for training and fine-tuning. This makes it inaccessible to many individuals and smaller organizations with limited computing power.
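As a rough illustration of why parameter counts in the billions translate into demanding hardware, the sketch below estimates the memory needed just to hold the model weights. It assumes 16-bit precision (2 bytes per parameter), which is an assumption and not a figure from the cited source, and it ignores activations, optimizer state, and any other runtime overhead. The function name is hypothetical.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes).

    bytes_per_parameter=2 assumes 16-bit precision (an assumption).
    """
    return num_parameters * bytes_per_parameter / 1e9


# Smallest and largest Llama 2 sizes mentioned in the text
for label, count in [("7B", 7e9), ("70B", 70e9)]:
    print(f"Llama 2 {label}: about {weight_memory_gb(count):.0f} GB of weights")
```

Under these assumptions, the 7-billion-parameter model needs roughly 14 GB for its weights and the 70-billion-parameter model roughly 140 GB, before any memory required for training or fine-tuning is counted.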

Conclusion

While the Llama 2 model has shown promise in various applications, including medicine and ophthalmology, it is essential to consider its limitations. These include its high computational requirements, lengthy training time, potential for bias, limitations in handling non-English languages, inferior coding abilities, and an unclear training dataset. Further research and development are required to address these limitations and enhance the model’s effectiveness and applicability in medicine and ophthalmology.

References

Meta AI [Internet]. [cited 2023 Oct 31]. Llama 2. Available from: https://ai.meta.com/llama-project.
Waisberg E, Micieli JA. Neuro-ophthalmological optic nerve cupping: An overview. Eye and Brain.

Funding

Open Access funding provided by the IReL Consortium.

Author information

Authors and Affiliations

University College Dublin School of Medicine, Belfield, Dublin, Ireland
Mouayad Masalkhi
Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, MI, USA
Joshua Ong
Department of Ophthalmology, University of Cambridge, Cambridge, UK
Ethan Waisberg
Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, Reno, NV, USA
Nasif Zaman, Prithul Sarker & Alireza Tavakkoli
Center for Space Medicine, Baylor College of Medicine, Houston, TX, USA
Andrew G. Lee
Department of Ophthalmology, Blanton Eye Institute, Houston Methodist Hospital, Houston, TX, USA
Andrew G. Lee
The Houston Methodist Research Institute, Houston Methodist Hospital, Houston, TX, USA
Andrew G. Lee
Departments of Ophthalmology, Neurology, and Neurosurgery, Weill Cornell Medicine, New York, New York, USA
Andrew G. Lee
Department of Ophthalmology, University of Texas Medical Branch, Galveston, TX, USA
Andrew G. Lee
University of Texas MD Anderson Cancer Center, Houston, TX, USA
Andrew G. Lee
Texas A&M College of Medicine, Texas, USA
Andrew G. Lee
Department of Ophthalmology, The University of Iowa Hospitals and Clinics, Iowa City, IA, USA
Andrew G. Lee

Contributions

M.M: Literature Review and Writing. J.O: Manuscript Review and Editing. E.W: Manuscript Review and Editing. A.G.L: Manuscript Review and Editing.

Corresponding author

Correspondence to Mouayad Masalkhi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval and consent to participate

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Masalkhi, M., Ong, J., Waisberg, E. et al. A side-by-side evaluation of Llama 2 by meta with ChatGPT and its application in ophthalmology. Eye (2024). https://doi.org/10.1038/s41433-024-02972-y

Received: 24 November 2023
Revised: 10 January 2024
Accepted: 29 January 2024
Published: 12 February 2024
DOI: https://doi.org/10.1038/s41433-024-02972-y
