This story was originally published by CalMatters. Sign up for their newsletters.
California government agencies are going all-in on generative artificial intelligence tools after Gov. Gavin Newsom’s 2023 executive order to improve government efficiency with AI. One deployment recently touted by the governor is a chatbot from the California Department of Forestry and Fire Protection, the primary agency tasked with coordinating the state’s wildfire response.
The chatbot, which Cal Fire says is independent of Newsom's order, is meant to give Californians better access to “critical fire prevention resources and near-real-time emergency information,” according to a May release from Newsom’s office. But CalMatters found that it fails to accurately describe the containment of a given wildfire, doesn’t reliably provide information such as a list of evacuation supplies and can’t tell users about evacuation orders.
Newsom has announced AI applications for traffic, housing and customer service to be implemented in the coming months and years. But Cal Fire’s chatbot issues raise questions about whether agencies are following best practices.
“Evaluation is not an afterthought,” said Daniel Ho, law professor at Stanford University whose research focuses on government use of AI. “It should be part of the standard expectation when we pilot and roll out a system like this.”
The chatbot uses the Cal Fire website and the agency’s ReadyForWildfire.org to generate answers. It can tell users about topics such as active wildfires, the agency, fire preparedness tips and Cal Fire’s programs. It was built by Citibot, a South Carolina-based company that sells AI-powered chatbots for local government agencies across the country. Cal Fire plans to host the tool until at least 2027, according to procurement records.
“It really was started with the intent and the goal of having a better-informed public about Cal Fire,” said Issac Sanchez, deputy chief of communications for the agency.
When CalMatters asked Cal Fire’s bot questions about what fires were currently active and basic information about the agency, it returned accurate answers. But for other information, CalMatters found that the chatbot can give different answers when the wording of the query changes slightly, even if the meaning of the question remains the same.
For example, an important way Californians can prepare for fire season is assembling a bag of emergency supplies should they need to evacuate. Only “What should I have in my evacuation kit?” returned a specific list of items from Cal Fire’s chatbot. Variations of the question that included “go bag,” “wildfire ready kit” and “fire preparedness kit” instead returned either a prompt to visit Cal Fire’s “Ready for Wildfire” site, which has that information, or a message saying “I’m not sure about the specific items you should have” and the wildfire site link. Two of those terms are present on the site the chatbot referenced.
And while the chatbot didn’t generate incorrect answers in any of the queries CalMatters made, it doesn’t always pull the most up-to-date information.
When asked if the Ranch Fire, a 4,293-acre fire in San Bernardino County, was contained, the chatbot said that the “latest” update as of June 10 showed the fire was 50% contained. At the time CalMatters queried the chatbot, the information was six days out of date – the fire was 85% contained by then.
Similarly, when asked about current job openings at the agency, the chatbot said there weren’t any. A search on the state’s job site showed two positions at Cal Fire accepting applications at the time.
Mila Gascó-Hernandez is research director for the University at Albany’s Center for Technology in Government and has studied how public agencies use AI-powered chatbots. Two key factors she uses to evaluate such chatbots are the accuracy of the information they provide and how consistently they answer the same question even when it is asked in different ways.
“If a fire is coming and you need to know how to react to it, you do need both accuracy and consistency in the answer,” she said. “You’re not going to think about ‘what’s the nice way to ask the chatbot?’”
Currently, the chatbot is unable to provide information about evacuation orders associated with fires. When asked who issues evacuation orders, it sometimes correctly said law enforcement, while other times said it didn’t know. Cal Fire’s Sanchez said it’s reasonable to expect the chatbot to be able to answer questions about evacuations.
If there are no evacuation orders for a particular fire, he said, “the answer should be ‘there doesn’t appear to be any evacuations associated with this incident.’”
Sanchez said he and his team of about four people tested the chatbot before it went out by submitting questions they expected the public to ask. Cal Fire is currently making improvements to the bot’s answers by combing through the queries people make and ensuring that the chatbot correctly surfaces the needed answer.
When CalMatters asked the bot “What can you help me with?” in early May, it responded, “Sorry I don’t have the answer to that question right now” and asked if CalMatters had questions about information on the Cal Fire site. By mid-June, that answer was updated to being able to “provide answers to questions related to information located on this page such as details about current fires, CAL FIRE job classifications, examination requirements and CAL FIRE's various programs.”
“The big message we want to get across,” Sanchez said, “is be patient.”
But experts said the process of kicking the tires on a chatbot should happen long before procurement begins.
The preferred process, Stanford’s Ho said, is to establish criteria for how the chatbot should perform before a vendor is selected so there are clear benchmarks to evaluate the tool. Ideally, those benchmarks are created by an independent third party. There should also be an evaluation of the benefits and risks before the chatbot is released.
And in a best-case scenario, the public would be involved before launch, Albany’s Gascó-Hernandez said. Agencies interested in using chatbots should identify the questions the public is likely to ask the AI tool ahead of time, ensure those are representative of the expected population the agency serves and refine the chatbot by having members of the public pilot the system to ensure it provides the kind of information they seek.
“These user engagement and user experiences are very important so the citizen ends up using the chatbot,” she said.
This article was originally published on CalMatters and was republished under the Creative Commons Attribution-NonCommercial-NoDerivatives license.