Although LLMs offer broad functionality across a vast range of applications, these models are far from perfect. Whether you use LLMs personally or for scientific research, it is vitally important to know their limitations and risks.
The main takeaway is: LLMs make mistakes!
More to the point:
- The initial (and any subsequent) datasets used to train an LLM may be biased, which can cause the LLM to return results that are incomplete or plainly false.
- Conclusion: be aware of what data is included in, and excluded from, an LLM's training, especially in the domain for which you are using it.
- The dataset(s) an LLM is trained on can carry legal problems and privacy violations. Using that data might infringe on copyright or other legal rights of the data's owners.
- Conclusion: always check LLM responses for possible legal infringements and do not use any such data.
- Because an LLM ‘aims to please’, it may produce an answer even when it does not know one. These hallucinations can look very convincing but are exactly that: hallucinations.
- Conclusion: always verify responses against other sources to make sure the information is factual (see the cross-checking sketch after this list).
- Although efforts are underway to mitigate this, LLMs can behave unethically. They can, quite deliberately, fabricate information when their training data is lacking or their prompting is misguided.
- Conclusion: always use common sense when interpreting LLM responses and verify them against other resources.
- Not all models have (web-search) access to real-time information; their training data is often at least a year old. Older and simpler models in particular are unreliable for finding and processing recent information.
- Conclusion: if you need up-to-date information, use one of the modern LLMs that offers web search by default or as an option.
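
To make the verification advice concrete, here is a minimal sketch of one cross-checking pattern: ask the same factual question to two independent models, extract the key fact, and flag any disagreement for human review. The `ask_llm` helper, the model names, and the canned answers are hypothetical placeholders rather than any particular provider's API; replace them with real SDK calls to whichever LLMs you use.

```python
import re

def ask_llm(model: str, question: str) -> str:
    """Hypothetical stand-in for a real provider SDK call; it returns
    canned answers here so the sketch runs as-is."""
    canned = {
        "model-a": "The HTTP/2 specification (RFC 7540) was published in 2015.",
        "model-b": "HTTP/2 was published as RFC 7540 in May 2015.",
    }
    return canned[model]

def cross_check(question: str, models: list[str]) -> None:
    """Ask the same question to several independent models, extract the
    key fact (a year, in this example), and flag any disagreement."""
    answers = {m: ask_llm(m, question) for m in models}
    facts = set()
    for model, answer in answers.items():
        match = re.search(r"\b(?:19|20)\d{2}\b", answer)
        facts.add(match.group() if match else "no year found")
        print(f"  {model}: {answer}")
    if len(facts) == 1:
        print(f"Models agree on {facts.pop()}; still confirm a primary source.")
    else:
        print(f"Models disagree ({facts}); verify manually before using.")

cross_check("In what year was HTTP/2 published?", ["model-a", "model-b"])
```

Agreement between models raises confidence but does not prove factuality, so checking a primary source remains the final step.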
So: use LLMs, but be aware of their limitations, and never rely solely on the response of an LLM.