Using LLMs in testing
October 26, 2023
Is an LLM a possible partner for testing?
I’ve been reading a bunch of technical whitepapers recently. They are a useful way to see the history of the technologies we use every day, and to dig deeper into their design.
I’ve also been playing around with LLMs, ChatGPT in particular. I wanted to see how far I could get in testing a website. Once the browser extension came out, I was eager to give it a try, so I gave it my website to test. I had to specify the language and tool in the prompt (Python/Playwright), and then wanted to see how it did. The results? Just OK. It came up with the boilerplate just fine, but didn’t use the customary syntax. It wanted to test a login, but my site doesn’t have a login. After several more prompts to correct it, I got a pretty good, albeit simple, response. Much like working with a junior programmer.
I was excited to come across this whitepaper on arXiv.org:
Zhe Liu, Chunyang Chen, Junjie Wang, Mengzhuo Chen, Boyu Wu, Xing Che, Dandan Wang, Qing Wang. "Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI Testing via Functionality-aware Decisions." arXiv:2310.15780 [cs.SE]
The authors made a much more successful attempt at using an LLM in testing than my simple experiment did. I suggest reading the whitepaper, but in essence the authors built a framework that captures the GUI information, constructs the LLM prompts, and saves the test results. They report much better results than the testing tools currently available.
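To make that idea a bit more concrete, here is a minimal sketch of one step of such a loop in Python. It is only an illustration of the three pieces mentioned above (GUI information, prompt, saved result) and not the authors' implementation. Everything in it is an assumption made for the example: the `screen` dictionary layout, the function names, the JSON reply format, and `fake_llm`, which stands in for a real model call so the sketch runs offline.

```python
import json


def describe_screen(screen):
    """Turn a (hypothetical) screen snapshot into plain text for a prompt."""
    widgets = ", ".join(f"{w['type']} '{w['label']}'" for w in screen["widgets"])
    return f"Screen: {screen['name']}. Widgets: {widgets}."


def build_prompt(screen, history):
    """Combine the current screen with what has already been tested."""
    tested = "; ".join(history) if history else "nothing yet"
    return (
        f"{describe_screen(screen)}\n"
        f"Already tested: {tested}.\n"
        'Reply with JSON like {"action": "tap", "target": "<widget label>"}.'
    )


def fake_llm(prompt):
    """Stand-in for a real model call; always returns the same canned reply."""
    return json.dumps({"action": "tap", "target": "Search"})


def run_step(screen, history, llm):
    """Ask the model for one action, validate it, and return a result record."""
    reply = json.loads(llm(build_prompt(screen, history)))
    labels = {w["label"] for w in screen["widgets"]}
    # Reject suggestions that name a widget the screen does not have,
    # such as a login on a site without one.
    valid = reply.get("target") in labels
    record = {
        "screen": screen["name"],
        "action": reply.get("action"),
        "target": reply.get("target"),
        "valid": valid,
    }
    if valid:
        history.append(f"{reply['action']} {reply['target']} on {screen['name']}")
    return record
```

A real version would replace `fake_llm` with an actual model call and would execute the chosen action on a device or browser before recording the outcome. The validation step is there because of my own experience above: the model may well suggest testing something that does not exist.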
I have some quick thoughts on this.
Will it scale? They used mobile apps, but web pages can be much more complicated. Dynamic web pages with lots of elements, some of which appear only under certain conditions, could complicate this approach.
How would this work with image processing? I think this would be fascinating to try. If you want a test to behave more like a user, consider that a typical user looks at a website and performs actions. They do not know about the elements on a page. If an LLM with image processing works, that would be so much more like a real user.
In all, I was impressed by this whitepaper and am excited to see the future of LLMs and how they impact testing.
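As a rough illustration of the scaling concern, the sketch below uses Python's standard `html.parser` to list the interactive elements in a single static HTML snapshot. The tag list, the class and function names, and the `tag:label` output format are my own assumptions for the example, not anything from the whitepaper.

```python
from html.parser import HTMLParser

# Assumed set of tags worth describing to a model; a real tool may differ.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementCollector(HTMLParser):
    """Collect interactive elements from one static HTML snapshot."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        # Skip elements that carry the HTML `hidden` attribute themselves.
        # Visibility controlled by CSS or scripts is not detected here.
        if "hidden" in attr_map:
            return
        label = attr_map.get("id") or attr_map.get("name") or attr_map.get("href") or ""
        self.elements.append(f"{tag}:{label}")


def summarize_snapshot(html):
    """Return a list of 'tag:label' strings for the visible interactive elements."""
    collector = InteractiveElementCollector()
    collector.feed(html)
    collector.close()
    return collector.elements
```

Even this toy version shows the problem. The list grows with every link and form field on the page, and a static snapshot says nothing about elements that scripts add or reveal later, so a real tool would have to re-read the page after each action.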