Delving into wildfires

By Kumar Venkat

After a deep dive into wildfire data and then building the machine learning (ML) models that are now part of FireVision, I am emerging with the strong belief that the rapidly growing wildfire problem is highly actionable if we have the right data and metrics. While there is an element of randomness to wildfires, there is also a method to the madness. Definite patterns do exist in the wildfire data. These patterns not only let us build models to predict aspects of future wildfires but also allow us to think about how we might mitigate the risks of large wildfires in the coming years and decades.

I’ll start with some insights gleaned from the historical wildfire data of the last three decades and then transition into what the ML models are telling us about the future up until 2040. I’ll try to tell the story of wildfires — and touch on what we can do about them — using numbers and patterns in the form of charts and video animations.

Learnings from historical wildfires

We have an excellent record of wildfires in the United States between 1992 and 2020 from the USDA Forest Service. We’ll use the charts below to show how wildfires are continuing to evolve in the 11 contiguous western states, which bear the brunt of wildfire damage in the US.

Wildfires can have natural or human causes. The primary natural cause is lightning, which we’ll come back to when we talk about ML models. There are a dozen different human causes ranging from open burning to malfunctioning power lines.

Natural causes accounted for 36% of all wildfires and 68% of the acres burnt in the western US in 2001–2010. However, in the next 10 years (2011–2020), natural causes accounted for only 23% of the fires and 57% of the acres. Human causes are now responsible for 77% of the fires and 43% of the acres (up from 64% and 32% respectively). The two charts below illustrate this, highlighting an important trend: Human activity is increasingly becoming the dominant cause of wildfires. This, of course, points to a way forward. While we can’t control lightning, we can influence human causes and potentially modify the trajectory of future wildfires.
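The human-cause percentages quoted above are the complements of the natural-cause percentages. A minimal sketch of that arithmetic follows, assuming every fire is classified as either natural or human-caused with no undetermined category (an assumption that the quoted numbers are consistent with). The variable and function names are illustrative only.

```python
# Natural-cause shares (percent) quoted in the article for the western US.
natural_share = {
    "2001-2010": {"fires": 36, "acres": 68},
    "2011-2020": {"fires": 23, "acres": 57},
}


def human_share(natural):
    """Return the human-cause share as the complement of the natural share.

    Assumes only two cause categories (natural, human) that sum to 100%.
    """
    return {metric: 100 - pct for metric, pct in natural.items()}


human = {period: human_share(shares) for period, shares in natural_share.items()}

for period, shares in human.items():
    print(period, shares)
```

This reproduces the 64%/32% and 77%/43% figures for the two decades.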

Fire intensity (quantified as acres burnt per fire) has increased dramatically for natural fires, which are more likely to occur in remote locations and at higher elevations. The doubling of fire intensity for natural fires (between 2001–2010 and 2011–2020) suggests that the warmer and drier climate in the summer months is pushing fires to be significantly larger and more difficult to contain. Human-caused fires have increased in intensity by about 50%, and their average intensity remains at about a quarter of that of natural fires. One reason for this might be that human-caused fires generally occur near urban areas or farms where they can be more easily detected and put out.
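The "about a quarter" comparison can be cross-checked from the cause shares quoted earlier. Since intensity is acres per fire, the ratio of a cause's acre share to its fire share gives its intensity relative to the average fire in the same decade. The sketch below uses only the percentages from the article; the helper names are hypothetical.

```python
# (fire share %, acre share %) by cause, as quoted in the article.
shares = {
    "2001-2010": {"natural": (36, 68), "human": (64, 32)},
    "2011-2020": {"natural": (23, 57), "human": (77, 43)},
}


def relative_intensity(fire_pct, acre_pct):
    """Acres per fire relative to the all-cause average of the same period."""
    return acre_pct / fire_pct


def human_to_natural_ratio(period):
    """Ratio of human-caused to natural fire intensity within one period."""
    natural = relative_intensity(*shares[period]["natural"])
    human = relative_intensity(*shares[period]["human"])
    return human / natural


for period in shares:
    print(period, round(human_to_natural_ratio(period), 2))
```

The ratio comes out at roughly 0.26 for 2001–2010 and 0.23 for 2011–2020, consistent with human-caused fires averaging about a quarter of the size of natural fires. The shares alone cannot confirm the decade-over-decade increases, because those also depend on total fire counts and total acres, which are not quoted here.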

One exception is fires caused by power transmission lines and equipment. Their intensity is the highest among human causes and is increasing (though it is still lower than the intensity of natural fires), likely because power lines run through wilderness that is harder to access than the locations of most other human-caused fires. That said, power transmission accounts for just 2% of the fires and 3% of the acres burnt (up from 1% and 2% respectively), even though these fires get outsized coverage in the media.

Is there a spatial pattern to the historical wildfires across the western states? Keep in mind that wildfires are always very low probability events. Even if the conditions are ripe, such as extended warm and dry spells with plenty of burnable fuel on the ground, we might still not see a wildfire. So the historical record is relatively sparse spatially and doesn’t by itself reflect the underlying risk of wildfires. Nevertheless, a noticeable pattern emerges if we look at the data over the nearly 30 years from 1992 to 2020.

The distribution of fire sizes below from the FireVision Historical Maps shows that larger wildfires have generally occurred away from the coast and population centers, and likely at higher elevations. Most fires in Washington, Oregon and California have been small or medium, with the exception of southern Oregon, eastern Oregon/Washington and northern California. Parts of Nevada and Utah haven’t experienced any recorded fires at all, as indicated by white spaces.

The distribution of fire causes below follows the fire sizes to some extent. The larger fires away from the coast and population centers and at higher elevations are also more likely to be natural-caused fires.

How have fire characteristics historically varied across a year? The next three charts (along with some additional data) highlight a few critical things about fires in the 2011–2020 period:

  • Small fires (< 100 acres) dominate. They actually account for nearly 97% of all wildfires. Just over 1% of fires are large (> 1000 acres).
  • Fires of all sizes peak in the summer months. August has historically been the peak month for fires, but the peak appears to be moving towards July.
  • Natural-caused fires (essentially ignited by lightning) peak in the summer months and are negligible in the October–April time frame.
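The size classes used in the list above can be written as a small helper. The thresholds for small (< 100 acres) and large (> 1000 acres) come from the text. Treating everything in between as "medium", with both boundary values included, is an assumption made here for illustration.

```python
SMALL_MAX_ACRES = 100    # fires below this are "small" (from the article)
LARGE_MIN_ACRES = 1000   # fires above this are "large" (from the article)


def size_class(acres):
    """Classify a fire by acres burnt.

    Assumption: "medium" covers 100 to 1000 acres inclusive, since the
    article only defines the small and large thresholds.
    """
    if acres < 0:
        raise ValueError("acres must be non-negative")
    if acres < SMALL_MAX_ACRES:
        return "small"
    if acres > LARGE_MIN_ACRES:
        return "large"
    return "medium"


for acres in (12, 100, 650, 1000, 25000):
    print(acres, size_class(acres))
```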

Why machine learning?

Looking at the historical wildfire data and developing some intuition is useful, but not anywhere near enough to predict what might happen in the future at any given location. This is where machine learning comes in as a modeling technique. ML can be especially helpful in predicting things like wildfire characteristics where there are patterns arising from the complex interactions of many location-specific variables including:

  • Ground-level fuel
  • Canopy structure
  • Topography
  • Fire weather index (FWI) and its components for the recent past (based on past weather data) as well as for future years (based on highly granular climate projections)
  • Lightning stroke density and power per stroke

The interactions are complex enough that it is often difficult to identify a single variable (keeping all else constant) that can definitively determine the characteristics of a specific wildfire. Second-order and higher-order interactions rule. Deep neural networks in particular are ideally suited for modeling this type of problem, and that’s exactly what we ended up implementing. More on the ML methodology and modeling can be found in the FireVision user guide.
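To illustrate what a second-order interaction means in practice, here is a deliberately simplified toy response in which fire size depends on the product of fuel and dryness. This is not the FireVision model; the function, its scale factor and the 0-to-1 inputs are hypothetical and exist only to show why the effect of one variable cannot be read off while ignoring the others.

```python
def toy_fire_size(fuel, dryness):
    """Hypothetical response: size scales with the product of two inputs.

    Both inputs are on an arbitrary 0-to-1 scale; 1000.0 is an arbitrary
    scale factor, not a calibrated value.
    """
    return 1000.0 * fuel * dryness


def dryness_effect(fuel):
    """Change in toy fire size when dryness goes from 0 to 1 at fixed fuel."""
    return toy_fire_size(fuel, 1.0) - toy_fire_size(fuel, 0.0)


# The same change in dryness has no effect without fuel and a large
# effect with abundant fuel, so neither variable decides the outcome alone.
print(dryness_effect(0.0))
print(dryness_effect(1.0))
```

A purely additive model could not represent this behavior, which is one reason flexible models such as neural networks are attractive for problems of this kind.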

Lessons from machine learning

Fire Risk Index

The screenshot below shows the distribution of the new lightning-induced fire risk index (LFRI) across the western states on a 20 km grid for the month of July right at the peak of the fire season, using average fire weather conditions from 2021–2023. This index combines the wildfire susceptibility (as measured by the potential fire size, see next section) and the lightning power density at any location for a particular time of the year. A higher LFRI implies a higher probability of a larger fire from natural causes.
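The article does not give the formula behind LFRI, only that it combines potential fire size with lightning power density. As a rough illustration of how two such layers could be combined on a grid, the sketch below rescales each layer to the 0-to-1 range and multiplies them. The product form, the min-max scaling and all names are assumptions made here, not the FireVision method.

```python
def min_max(values):
    """Rescale a list of numbers to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def toy_risk_index(potential_fire_acres, lightning_power_density):
    """Hypothetical index: product of the two rescaled layers per location."""
    if len(potential_fire_acres) != len(lightning_power_density):
        raise ValueError("both layers must cover the same locations")
    size = min_max(potential_fire_acres)
    lightning = min_max(lightning_power_density)
    return [s * l for s, l in zip(size, lightning)]


# Three made-up grid points (values are illustrative, not real data).
acres = [50, 500, 2000]
lightning = [0.1, 0.9, 0.5]
print(toy_risk_index(acres, lightning))
```

With a product form, a location scores high only when both the potential fire size and the lightning activity are high, which matches the qualitative description of the index.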

The pattern seen below is quite similar to the pattern of actual historical wildfire sizes in the previous section: Larger risk index values occur away from the coast and population centers, and are clustered in the middle of the western states. Since this is an output of the predictive ML model, we are no longer dependent on just the historical record and there are no white spaces: We’ve evaluated all point locations on a 20 km by 20 km grid for illustration using the FireVision Risk Maps and every point in that space gets a risk value.

If that is the risk distribution at the peak of the fire season, how does the risk index vary across a year? The video clip below shows this from May to November using the average of 2021–2023 for fire weather data and average lightning climatology for 2010–2023. In May, which is typically just before the start of the fire season, the risk is already quite high in the eastern and southeastern portions of the western states. By November when the peak fire season is pretty much behind us, the risk looks extremely low.

The beauty of the ML model is that it can not only assess the fire risk at every single location in the space that we are interested in, but it can also look ahead into the future. As the climate changes and summers become warmer/drier, fire risk is going to increase. We’ve already seen a preview of this in the past decade.

The next chart depicts a systematic monthly view of the risk distribution at 7500 locations on a uniform 20 km grid across the 11 western states between now and 2040. The number of high-risk locations increases across the board as we move into the future, peaking at 21% of all locations in July 2040 (up from 10% in 2023).
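To put the percentages in terms of grid points, the quoted shares can be converted to approximate location counts. The sketch below uses only the figures from the text (7500 locations, 10% in 2023, 21% in July 2040); the variable names are illustrative.

```python
GRID_POINTS = 7500  # locations on the 20 km grid, as quoted in the article

# Share of locations classed as high risk in July, in percent.
high_risk_pct = {2023: 10, 2040: 21}

# Integer arithmetic avoids floating-point rounding in the counts.
high_risk_count = {
    year: GRID_POINTS * pct // 100 for year, pct in high_risk_pct.items()
}
growth_factor = high_risk_pct[2040] / high_risk_pct[2023]

print(high_risk_count)
print(growth_factor)
```

That is about 750 high-risk locations in 2023 growing to about 1575 in 2040, a factor of 2.1.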

Fire risk indexes like LFRI can be used in a range of applications including insuring/de-risking catastrophic wildfires, identifying locations for prioritized mitigation efforts such as forest thinning, and evaluating the permanence of nature-based carbon credits.

The next two video clips show an animated evolution of the risk index values between now and 2040 for the months of May and July in the western US.

One last point here. Why do we have a risk index for lightning-induced fires but not for human-caused fires? The lightning climatology gives us an indirect but definitive way to model the ignition probability which is essential for a true risk index. As we’ve noted before, lightning strikes are not in our control, so the ignition probabilities here are a given and can be thought of as occurring outside the system that we are modeling. In contrast, human-caused fires are largely in our control but ignition probabilities are difficult to estimate (although they can in principle be reduced to a tolerable level).

Fire susceptibility

We quantify the wildfire susceptibility of a location by the potential size of the fire if there were an ignition event at that location. It is a conditional measure: it describes what would happen given an ignition event, not how likely an ignition is. The susceptibility is a function of the fuel conditions on the ground, characteristics of the landscape and the recent or projected fire weather.
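One way to read this definition, offered here as an interpretation and not as the FireVision formulation, is as a conditional expectation. Let $A$ be the acres burnt, $I$ an ignition event, $x$ a location and $t$ a time of year (all symbols introduced here for illustration). If no fire means $A = 0$, then:

```latex
S(x, t) = \mathbb{E}\left[ A \mid I, x, t \right],
\qquad
\mathbb{E}\left[ A \mid x, t \right] = P(I \mid x, t)\, S(x, t)
```

The second identity shows why a true risk measure also needs an ignition probability: susceptibility $S$ alone says how large a fire could become, while $P(I \mid x, t)$ says how likely it is to start.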

The susceptibility measure can be a powerful call to action. It is especially valuable when it comes to human-caused fires, which, as we’ve seen, are becoming more dominant and are largely in our control. If the fire susceptibility within a region is high, then additional measures will be needed to proactively prevent ignitions from sources such as faulty power lines, open burning or campfires in that region. Susceptibility numbers can also be used to drive targeted mitigation measures such as removal of forest debris and mechanical thinning of brush/forests.

Similar to the LFRI chart above, the chart below shows a systematic monthly view of the fire susceptibility distribution at 7500 locations across the western states between now and 2040. The susceptibility peaks in July, exceeding 1000 acres at 20% of all locations in 2023 and increasing to 37% in 2040.
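The statistic reported here is the share of grid locations whose potential fire size exceeds 1000 acres. A minimal sketch of that calculation follows; the threshold is from the text, while the function name and the ten sample values are made up for illustration.

```python
LARGE_FIRE_ACRES = 1000  # threshold quoted in the article


def share_exceeding(values, threshold=LARGE_FIRE_ACRES):
    """Fraction of locations whose value is strictly above the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v > threshold) / len(values)


# Potential fire sizes (acres) at ten hypothetical grid points.
toy_susceptibility = [40, 250, 1200, 5000, 900, 80, 1500, 300, 60, 20]
print(share_exceeding(toy_susceptibility))
```

Applied to the full 7500-point grid for each month and year, the same calculation would yield figures like the 20% and 37% quoted above.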

The two video clips below (generated using the FireVision Susceptibility Maps) animate the evolution of fire susceptibility between now and 2040 for the months of May and July.

Fire cause

The monthly distribution of fire causes in the next chart, generated using the FireVision Cause Maps, is again conditioned on ignition events, just like the fire susceptibility. The chart simply indicates the more likely cause of wildfires on a monthly basis across all ignition locations on a 20 km grid. Lightning-caused natural fires are much more likely in the summer months, while human causes are more likely at other times of the year.

What is not obvious in this chart, but something we’ve noted before, is that human causes will dominate certain geographical regions that are near the coast and near population centers, while natural causes will dominate in more remote areas and at higher elevations.


  • Human activity is increasingly becoming the dominant cause of wildfires, accounting for 77% of the fires and 43% of the acres burnt in the 11 western states during 2011–2020.
  • Lightning as a natural cause triggers many of the larger fires in more remote locations and still accounts for the larger share of the acres burnt (57% in 2011–2020).
  • Fire sizes have increased by as much as a factor of two between the first and second decades of this century, pointing to a clear fingerprint of climate change.
  • Fires caused by power transmission lines are growing but still account for just 2% of the fires and 3% of the acres burnt.
  • ML models predict that the share of high-risk locations for lightning-induced wildfires in the western states will roughly double (from 10% to 21%) between now and 2040 at the peak of the fire season.
  • The number of high-susceptibility locations for large fires from any cause will also nearly double (from 20% to 37%) between now and 2040.
  • The data and model predictions are clearly a call to action on wildfires. There is much we can do by way of prevention, mitigation and risk management if we have the right metrics to act on.
  • We’ve presented two powerful location-specific metrics here that can drive action on wildfires: a risk index and a susceptibility measure.

This article is also published on the author's blog. illuminem Voices is a democratic space presenting the thoughts and opinions of leading Sustainability & Energy writers; their opinions do not necessarily represent those of illuminem.


About the author

Kumar Venkat is the Founder and CEO of Model Paths. He served as the principal climate consultant for Climate Trajectories. In June 2021, he was appointed CTO of Planet FWD where he led the development of a best-in-class carbon accounting solution for the food and agriculture space.
