University of Bielefeld -  Faculty of technology
Networks and distributed Systems
Research group of Prof. Peter B. Ladkin, Ph.D.


Article RVS-J-97-04

To Drive or To Fly - Is That Really The Question?

Peter B. Ladkin

24 July 1997


Abstract: Some statistical comparisons of the risk of a fatal accident while flying on a commercial jet and while driving on rural interstates were published during 1989-1991. They largely use data from the mid 70's through the 80's, and show that the risk of dying on a commercial jet flight was uncorrelated with the length of the flight, and that for trips longer than 303 miles, flying was safer than driving if you were in the statistical group of `safe' drivers. The age and the type of data warrant some comments on the effectiveness of statistical comparisons, in particular how commercial passenger flying may have changed in the 90's.

At the end of a commercial flight, one may occasionally hear the captain announce `Welcome to Podunque and thank you for flying Birdseed Airlines. The safest part of your trip is over. We wish you a safe onward journey.' Everyone seems to know this comparison, but where does it come from? From the magic of statistics maybe. After all "you can prove anything with statistics", yet this seems to be a substantive assertion about real risk and safety, rather than a lie, a damn lie, or even worse a statistic. A more reasonable proverb is that one can persuade people of almost anything by misuse of statistical figures - that is, by incorrect statistical reasoning that is hard for a layperson to critique. But believing only `the experts' also has its pitfalls. I advocate believing and constructively criticising not the experts, but the experts' arguments. If you are presented with the arguments, you can see the reasoning. If you can see the reasoning, you can check whether it's valid or not. Hard work maybe, but it doesn't come any better than that. Well, so what are these arguments that flying is safer than driving? I found precisely three, in the journal Risk Analysis in 1990 and 1991, which I shall discuss.

First of all, what is risk? In this case, the risk seems to be the probability that one could die while engaging in a particular sort of activity (1). Thus one must somehow measure the number of people that died engaged in that activity (while flying or driving), and measure the exposure, that is, how much of the activity was engaged in (miles, trips, or time), to obtain a rate (deaths per so-many miles, or per so-many trips, or per hour of time). One is also trying to make a comparison between two sorts of rates, and one doesn't want to be comparing apples to oranges. It thus seems sensible to see if such a comparison could help to determine a personal choice, say, whether to drive or to fly on a particular trip. Whether one should fly home from the airport or rather drive...

Making statistical comparisons of risk is itself a risky business (though one risks dissension rather than dissection). For example, let's take the simple question: how risky is driving? Evans, Frick and Schwing (EvFr90) note that the fatality risk for `high-risk' drivers is over 1000 times the fatality risk for `low-risk' drivers. That suggests right off that there may be no one simple figure which will suffice to answer the question.

One might be tempted to infer that the low-risk driver stands less chance of being killed in a car accident on a day-long journey from San Francisco to Los Angeles than the high-risk driver does when driving around the corner to buy bread. But no one has gathered statistics on, for example, the likelihood of having a fatal accident when tired after so-and-so many hours of driving, or when driving in the early morning while hungry and maybe tired or hung-over from the splendid evening before, or when angry or happy, or when the weather's gray rather than sunny; let alone on such specificities as driving to Los Angeles versus around the corner (in densely-packed SF, or in neighborly Eureka?) to buy bread.

Suppose we seat a high-risk driver next to a low-risk driver on a commercial jet aircraft. Then we may imagine that they have very similar probabilities of losing their lives on that trip: either that aircraft crashes or it doesn't; passengers seated near to each other have similar (but not identical) exposure to the hazard; and we assume (what cabin crews may tell you is certainly false) that they have similar capabilities to extricate themselves from the hazardous situation (they don't `freak out', and they're both physically fit enough to function even during smoke inhalation). But although it might be almost certain that the risk of dying on the aircraft trip is lower than the risk of dying while driving home for the high-risk driver (who, we shall see, is barely advised to take a trip around the corner), it is by no means certain that the flying risk for the low-risk driver is lower than that of going home. But even that's not clear. Both may be somewhat anoxic, having been at a cabin altitude of some 7,000 feet for a while, as well as dehydrated. And if our low-risk driver smokes regularly, (s)he is certain to feel the effects of the altitude, which, as aviation physiologists know and pilots should know, measurably slows reaction times and impairs judgement, as well as inhibiting one's ability to detect this impairment. Rather like being drunk without knowing it, maybe. I don't know any statistics on general air or traffic accidents that could enable the effects of these factors to be determined. To my knowledge the only physiological or psychological impairment for which general statistics have been gathered is that of alcohol consumption, and certainly not for hypoxia.

Importantly, statistics used for comparison can take little account of individual variations. For example, when I was a 40-year-old, I fit into the `low-risk' category. I had never had a car accident while moving. This logically implies (at least in a temporal logic) that I never had a car accident while moving when I was 18 years old, and, thus, in a much higher-risk category. Let's generalise. All of the drivers now in the low-risk category were in a much higher-risk category when they were younger - and of course did not then die. That means that the conditional probability that a low-risk driver died while in this high-risk driving group is, in fact, nil! One is tempted to conclude, correctly, that the safest driving group to be in is the group of drivers that are nominally high-risk but will become low-risk at a later time. But although this inference is statistically correct, physically it puts the cart before the horse - people who will live longer simply aren't going to die now, by logic. But it can be hard to detect this kind of semantic play in statistical argument unless one is practiced.

So simply belonging to a nominally high-risk group isn't enough to enable one to calculate one's individual chances of dying. There are risk factors that are within our control, and factors that are outside our control. When we have taken the decision to step on a particular commercial flight, the risk factors for an accident are mostly out of our control. When we drive our cars home, many of them are still within our control (we stay sober, drive at appropriate speeds, watch the environment carefully, and follow applicable control laws and signs) and others we can influence, even though they are not entirely within our control (choosing the freeway rather than busy city streets to drive through, driving outside of rush-hour, staying well away from heavy goods vehicles on wet roads). We can choose to exercise that control, or not. I don't know any method which reflects this degree of control choice in the statistics - nor any method that could, if we are talking of fatalities. One might say: if we choose not to exercise control, we become a statistic. If we choose to do so, who knows what the risks are? But ignorance should not be confused with reality. What is there to stop statistics being gathered on all of the features I have considered? Physically, it is hindered by the difficulties of deriving measurements of some of them (alertness? hypoxic impairment? chosen degree of driving care? all seem to require measurements while the subject is still alive); economically, it is certainly hindered by the cost of collecting them; socially, by the priorities of rescue services at the scene of accidents, whose first job is to tend the wounded, not to get them to fill out questionnaires on their deceased fellow travellers (that task can be left to the ambulance-chasers). So it is certain we shall remain statistically ignorant on many of the factors that would enable us to make a suitable judgement of our individual driving fatality risk.

Given that is so, what do we know? One thing we know is the frequencies of certain types of accidents relative to certain features of the environment. The ease of gathering data and calculating these past frequencies, coupled with a belief or a justification that `all other factors remained more-or-less equal' (ceteris paribus) and a belief or justification that these past statistics ceteris paribus are a good guide to the future, can lead us to interpret chance (or `probability' if you will) as this frequency. For example, if we know how many individual passengers have travelled on US aircraft in the last ten years, and we know how many of those have died in aircraft crashes, then we can divide one number by the other to calculate the proportion of air travellers in the last ten years that have died in US crashes; we can assume ceteris paribus that the individual airline and the number of flights don't make that much difference, assume these past figures will be a good guide to the future, and conclude we have exactly that chance of dying on a US airline flight in the next ten years. The problem is, of course, that individual airline and number of flights do make a big difference, at least for the period 1975-1986 (BaHi89). And that is one of the known difficulties with the frequentist view. What factors have we not measured that do make a difference?

The numbers we come up with on a frequentist view cannot be objective properties of the situation. Suppose that Cornbread Airlines has a fleet of 20 100-seater aircraft that each fly 500 route segments per year, all completely full, and has never carried any individual person on more than one flight. So Cornbread has flown 5000 segments per aircraft in the last ten years, that is 100,000 segments, each with 100 non-returning passengers. It has had two 100-seater plane accidents in the last 10 years. Cornbread's fatal frequency over ten years is thus

(2 x 100) / (20 x 100 x 500 x 10) = 1 in 50,000 passengers per decade

Suppose Birdseed Airlines has twice as many of the same type of aircraft (40 planes), same passenger loading, same plane usage, no returnees, but has only had one fatal accident in this time. Birdseed's fatal decade-frequency is

(1 x 100) / (40 x 100 x 500 x 10) = 1 in 200,000 passengers per decade
and we are assuming that these figures are a reliable guide to the future.

Suppose my travel agent always books me on either Birdseed or Cornbread. They have a total of 60 planes and 3 accidents between them, so I can calculate the ten-year rate for my travel agent's choice as

(3 x 100) / (60 x 100 x 500 x 10) = 1 in 100,000 passengers per decade
So Birdseed's rate is half that of the joint rate, and Cornbread's twice that of the joint rate. Under the frequentist assumption that this is a good guide to the future, I have a certain chance of dying aboard a certain flight. Let that be X. But that flight will either be a Birdseed flight, in which case my chances seem to be 0.5X, or a Cornbread flight, in which case my chances seem to be 2X. So, whichever I fly, I get two totally different numbers for what is supposed to be the same objective situation: either X and 0.5X if I am in fact flying on Birdseed, or X and 2X if I am flying on Cornbread. Whatever my chances really are, if they are an objective fact, then there should be only one number that reflects them. Thus if they are objective, the simple frequentist view cannot determine them. In fact, which number I get seems to depend on what I know about my flight, rather than any objective characteristic of the flight itself.
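The three decade-frequencies above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the text: the function and variable names are my own, and it bakes in the same simplifying assumptions (full planes, no returning passengers, everyone aboard dies in each accident).

```python
from fractions import Fraction

def fatality_frequency(accidents, seats, planes, segments_per_year, years=10):
    """Deaths per passenger carried over the period.

    Assumes, as in the text, full planes, no repeat passengers, and
    that every accident kills everyone on board.
    """
    deaths = accidents * seats
    passengers = planes * seats * segments_per_year * years
    return Fraction(deaths, passengers)

cornbread = fatality_frequency(accidents=2, seats=100, planes=20, segments_per_year=500)
birdseed = fatality_frequency(accidents=1, seats=100, planes=40, segments_per_year=500)
joint = fatality_frequency(accidents=3, seats=100, planes=60, segments_per_year=500)

for name, freq in [("Cornbread", cornbread), ("Birdseed", birdseed), ("joint", joint)]:
    # Each fraction reduces to 1/n, so the denominator is the "1 in n" figure.
    print(f"{name}: 1 in {freq.denominator // freq.numerator:,} passengers per decade")
```

Exact fractions are used rather than floating point so that the relations `birdseed == joint / 2` and `cornbread == 2 * joint` hold exactly, which is the 0.5X and 2X of the argument.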

This is a standard critique of the frequentist interpretation of chance, which may be contrasted with the Bayesian interpretation. The Bayesian interpretation in principle recognises the influence of factors we have not remarked, and interprets chances relative to our state of knowledge. An a priori probability is calculated, and this a priori probability is modified as significant events occur into an a posteriori probability, which takes into account the confirmation (or disconfirmation) of the a priori probability by the events. For example, suppose we have reason to believe that our design and construction processes allow us to build aircraft with a probability of X of suffering a major system failure of a certain type. X is our a priori probability. After testing and a certain amount of time in service, let us suppose we have suffered a number N of system failures of this type. Bayesian theory gives us a means of calculating the a posteriori probability Y as a function of X and N: Y = F(X,N), for some F given in textbooks on Bayesian theory.
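To make Y = F(X,N) a little more concrete, here is one common textbook choice of F, sketched in Python. It is not taken from the references cited here. It assumes the a priori belief is a Beta distribution with mean X, that each flight is an independent trial, and it needs two quantities the text does not give: how strongly the prior is held (`prior_weight`, a hypothetical pseudo-count of flights) and the number of flights observed (`trials`).

```python
def posterior_failure_probability(prior_x, prior_weight, failures, trials):
    """Posterior mean failure probability under a Beta-binomial update.

    prior_x      -- a priori failure probability X (mean of the Beta prior)
    prior_weight -- assumed strength of the prior, in pseudo-flights
    failures     -- number N of failures actually observed
    trials       -- assumed number of flights over which they were observed
    """
    alpha = prior_x * prior_weight          # pseudo-count of prior failures
    beta = (1.0 - prior_x) * prior_weight   # pseudo-count of prior successes
    return (alpha + failures) / (alpha + beta + trials)

# No failures in service pulls the estimate below the prior; several push it above.
y_clean = posterior_failure_probability(0.001, 1000, failures=0, trials=1000)
y_bad = posterior_failure_probability(0.001, 1000, failures=3, trials=1000)
print(y_clean, y_bad)
```

The point of the sketch is only that the a posteriori figure depends on what has been observed so far, which is the epistemic relativity the frequentist calculation lacked.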

So Bayesian theory attempts to introduce the epistemic relativity into estimating (and revising) chances that seemed to cause logical trouble with the frequentist view. Nevertheless, Bayesian approaches have their own problems -- see (Gly92, Chapter 8) for an explanation of Bayesian inference by a prominent Bayesian sceptic; (DaSo92, pp41-43, entry on Bayesianism) for a more detailed justification of how Bayesian inference works; and (DaSo92, pp374-378, entry on probability, theories of) for a comparison of the different ways, including frequentist ways, of calculating `chances': all three references contain reasonable bibliographies for the avid bookworm.

Enough pontificating - let's go unashamedly frequentist for the remainder of the essay. Driving fatality risk is measured by (EvFr90) as the driving fatality rate: number of driver fatalities per billion miles driven. The 1987 data showed 46,386 fatalities, of which 37 per cent were car drivers, occurring in 1924.3 billion vehicle miles of travel (presumably estimated), of which 71 per cent were travelled by cars. So,

(46,386 x 0.37) / (1924.3 x 0.71)

gives 12.56 car-driver fatalities per billion miles. This is a rate or frequency statistic. It tells one the past frequency of occurrence of an event relative to certain background information which is held constant.
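The base rate is easy to reproduce. The sketch below just redoes the division; the variable names are mine, and it reads the 71 per cent as the share of vehicle miles travelled by cars, which is what its place in the denominator implies.

```python
fatalities = 46_386              # all 1987 traffic fatalities
driver_share = 0.37              # fraction of fatalities that were car drivers
vehicle_miles_billion = 1924.3   # total vehicle miles, in billions
car_mile_share = 0.71            # assumed: fraction of those miles driven by cars

# Car-driver deaths divided by billions of car miles.
rate = (fatalities * driver_share) / (vehicle_miles_billion * car_mile_share)
print(f"{rate:.2f} car-driver fatalities per billion miles")
```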

Accurately measured cofactors of driving fatality include age, blood alcohol level, use of a seat belt, mass of the car, and type of road. (EvFr90) calculated rates for all combinations of factors:

They calculated `multiplicative factors' for all these. What's a multiplicative factor? It is the number by which the base rate is multiplied to give the rate for a particular group. The answers showed that drivers who had over the (then) legal limit of blood alcohol had a frequency of fatal accidents of nearly eight times the base frequency (note that the base also includes them, so that the factor calculated against all others will be higher!), and over 10 times that of drivers with a BAC of 0. Similarly, unbelted drivers had a frequency double that of belted drivers. Rates in heavier cars were half those in lighter cars. Fatality rates on rural interstates were about half the base rate. All this shows that there is huge variation in rate, depending on which group one falls into.

The rate for 40-year-old, alcohol-free, belted drivers travelling on rural interstates in a heavier car is thus 0.804 fatalities per billion miles (just multiply 12.56 by the factors above). In comparison, that for an 18-year-old, intoxicated, unbelted male driver on average roads in a lighter car is 930.8 fatalities per billion miles, over 1,000 times as high! From here on, `low-risk' shall mean drivers of the former population, and `high-risk', drivers in the latter population.

To make the comparison between driving and flying, (EvFr90) point out that the segment of the total journey on which there exists a real alternative, and thus a reasonably justified comparison, is the part of the journey between major metropolitan centers. They make the reasonable assumption that such a journey by road would be undertaken on a rural interstate. Thus they use journeys on a rural interstate for comparison. They also point out that the populations dying by car have a different age distribution than those dying by commercial plane. Car-driver deaths as a percentage of the total are extraordinarily high in the 19-20 age group (3.5 per cent of the total) and then taper off exponentially (so it seems) to 1 per cent at age 35 and 0.5 per cent at age sixty. (EvFr90) obtained mortality data (International Classification of Diseases category 841.3 - occupants of aircraft excluding crew), which form roughly a bell curve (OK, a Gaussian distribution) with median of about 10 per cent at age 40 for males, with 5 per cent levels at about 20 and 60; and a skewed bell for females, with about the same levels (4 per cent) as males at 20 and 60, but a skewed and flat median of 5 per cent at age 35. (EvFr90) corrected for the different distribution of population on commercial aircraft flights, and concluded that car drivers with a gender and age distribution of airline passengers have roughly a 24 per cent lower mortality rate than the base rate.

(EvFr90) concluded that `low-risk' drivers are

slightly less likely to be killed in 600 miles of rural interstate driving than in regularly scheduled airline trips of the same length. For 300-mile trips, the driving risk is about half that for flying. Hence, for this set of drivers, car travel provides a lower fatality risk than air travel for trips in the distance range for which car and air travel are likely to be competing modes.
The `high-risk' driver has a fatality rate over 1,000 times higher, as we have noted. But it wouldn't make much sense to conclude that for this class of driver, a trip of two-thirds of a mile by rural interstate carried a similar risk to a journey of equivalent length by commercial aircraft...

These figures are not final, however. There are at least three sources of bias that arguably should be taken into account. (BaHi89) analysed airline fatality data from 1977-86 and argued

The second point may easily be believed from (Boe96), whose breakdown of fatal commercial jet accidents by flight phase showed only 8.2 per cent of fatal accidents worldwide occurring in cruise, based on a single normalised exposure of 1.5 total flight hours. That is, more than 90 per cent of fatal accidents occurred in takeoff-climbout-climb or landing, both of which of course must occur exactly once on each flight segment. (SiWe91) modified the figures of (EvFr90) to reflect the correlation with flight segment rather than with total distance. They asserted that (EvFr90) had calculated their figures based on the average length of a trip of 880 miles (derived from the 1980's data). However, the average length of a flight segment was 551 miles over this period, and so the 880-mile figure represented on average 1.6 (880 / 551) segments. They obtained a figure of 244 or 249 fatalities per billion segments (based on two different methods of calculating the figure). They recalculated probabilities of flying fatalities for individual drivers based on average segment lengths, and calculated the indifference distances (those distances at which the probability for a type-X driver was equal for flying or driving), based on a linear extrapolation of the discrete probabilities (a presumption they justified by pointing out that at these low probability levels, linear extrapolation was expected to be exceedingly accurate). They concluded:
The indifference distances for a [single flight segment] and for low-risk, average, and high-risk drivers [...] are 303 miles, 37 miles, and 0.5 miles, respectively [...]. Consequently, for a low-risk driver it is safer to fly nonstop than to drive if the road distance is more than 303 miles [...].

[Suppose now that] only a three-segment flight is available. For such a situation, it is safer to fly than to drive if the distance is more than 909 miles for a low-risk driver, 111 miles for an average driver, and 1.5 miles for a high risk [sic] driver.

In the third example, a low-risk driver is considering a 602-mile trip (602 miles being the indifference distance obtained by [(EvFr90)] for a low-risk driver). However, the present calculations indicate that for such a situation the risk of driving is about twice the risk of flying.

As before, we may consider the `conclusion' for high-risk drivers to be a bit nuts. The figures show simply that flying is always safer for such a driver. The indifference distance is meaningless since it does not represent a real choice.
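The low-risk figures quoted from (SiWe91) can be roughly recovered from numbers already given in this essay. The sketch below is my own reconstruction, not their calculation: it assumes the lower per-segment figure (244 per billion segments), the 0.804 per billion miles low-risk driving rate from (EvFr90), and risk that grows linearly with miles driven and with segments flown.

```python
PER_SEGMENT_RISK = 244e-9      # flying: fatalities per segment (lower SiWe91 figure)
LOW_RISK_PER_MILE = 0.804e-9   # driving: low-risk fatalities per mile (EvFr90)

def indifference_distance(segments, per_mile_risk, per_segment_risk=PER_SEGMENT_RISK):
    """Road distance at which driving risk equals flying risk.

    Assumes driving risk is linear in miles and flying risk is
    linear in the number of segments, independent of their length.
    """
    return segments * per_segment_risk / per_mile_risk

one_segment = indifference_distance(1, LOW_RISK_PER_MILE)
three_segments = indifference_distance(3, LOW_RISK_PER_MILE)

# Driving risk relative to nonstop flying risk on a 602-mile trip.
ratio_at_602 = (602 * LOW_RISK_PER_MILE) / PER_SEGMENT_RISK

print(round(one_segment), round(three_segments), round(ratio_at_602, 2))
```

Under these assumptions the one-segment distance comes out at about 303 miles, the three-segment distance within a mile or two of the quoted 909, and the 602-mile ratio close to two, in line with the quoted conclusions.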

(Bar91) points out a further source of bias, acknowledged but not discussed, in the figures of (EvFr90):

Turboprops and piston aircraft have a much higher overall fatal accident rate than jet aircraft. (EvFr90) had worked with merged statistics for all sorts of planes, which distorted their data, since the death risk for jet travel in the years 1977-91 was 1 in 9.9 million per flight segment (figures for 1977-86 from (BaHi89), those for 1987-1989 in (Bar90), and the absence of fatalities from 1.1.90 to 15.8.90, and 16.8.76 to 31.12.76). He observed that the cumulative risk for a low-risk driver on a segment of 600 miles from the figures of (EvFr90) is 1 in 2.1 million, and therefore that the 600-mile road journey by a low-risk driver `entails roughly five times the mortality risk of the trip by jet'. Barnett suggests the equal-risk level for jets is about 130 miles, and that at these distances one doesn't really save any time by flying rather than driving.
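Barnett's two conclusions follow from the two risk figures in this paragraph alone. The sketch below uses only those figures; the variable names are mine, and the equal-risk distance assumes driving risk scales linearly with miles while the jet risk per segment does not depend on distance.

```python
jet_risk_per_segment = 1 / 9.9e6   # death risk per jet flight segment
drive_risk_600_miles = 1 / 2.1e6   # low-risk driver, 600 miles of rural interstate

# How many times riskier the 600-mile drive is than one jet segment.
risk_ratio = drive_risk_600_miles / jet_risk_per_segment

# Distance at which the (linearly scaled) driving risk equals one jet segment.
equal_risk_miles = 600 * jet_risk_per_segment / drive_risk_600_miles

print(f"ratio {risk_ratio:.1f}, equal risk at about {equal_risk_miles:.0f} miles")
```

The ratio comes out a little under five and the equal-risk distance a little under 130 miles, consistent with `roughly five times' and `about 130 miles'.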

What about the per-airline figures? There is currently controversy over a decision by an association advocating the interests of commercial airline passengers to publicise safety data broken down by airline. Airlines and some regulators believe that, while for some airlines seen to be unusually dangerous (no US airlines fit into this category) it would certainly be in the interests of passengers to make them aware of this situation, for most airlines a single accident could appear severely to damage an airline's reputation without necessarily being related to levels of safety. In simple terms, one unlucky accident can ruin your whole business. See (BaMe92) for a discussion about such biases that enter into public perception of risk after a major accident and their commercial effects.

There are some purely statistical biases that could also lead to a different risk assessment than seems reasonable. For example, Martinair (Holland) and Lauda Air (Austria) show up particularly badly on such measures because of a Martinair DC-10 accident on landing at Faro in 1992, and the loss of a Lauda Air B767 - to date still the only B767 loss - in Thailand. However, Martinair performed a thorough `reengineering' of their operations (Anon97) and recently a Martinair crew performed a superb job of recovering a landing on a wet runway in Boston in 1996 during which they unexpectedly lost much of their braking power (Comp.Martinair). I have flown Martinair and liked what I saw of their professionalism. As for Lauda Air, no probable cause was found, although the DFD data were consistent with an in-flight deployment of actual thrust-reverse, an apparently unrecoverable fault which is supposed to be completely prevented by a hydro-mechanical interlock. Subsequent tests found, however, that there were some situations on a high-time engine in which certain oil seals could deteriorate and impede the correct functioning of a hydraulic valve in the interlock system, thus allowing actual deployment of reverse thrust. Details may be found in (Comp.LaudaAir). It could be legitimately questioned whether this accident yields appropriate grounds for ranking Lauda Air amongst some notorious safety transgressors in developing or ex-Soviet-Union countries. As we have seen, simple accident rates, on a frequentist interpretation, do not always appear to yield plausible conclusions. This particular statistical `bias towards rare events' can thus be misleading - when the risk being measured is dependent on very rare events such as commercial airplane accidents, those who use frequentist interpretations of statistics often find that comparisons yield much starker results than seem reasonable.

Finally, when assessing risks, one must always be aware of the changing environment in which evidence is gathered. (BaHi89) note that the period in which they analysed the statistics (late 70's to late 80's) was characterised by considerably increased terrorist activity targeting US carriers; by the rebuilding of air-traffic control services after the dismissal in 1981 of many of the most qualified staff by the Reagan administration in response to a strike over working conditions; and by airline deregulation which many worried had led to cost-cutting at the expense of safety measures. (BaHi89) were able to show that

[...] world airline safety [...] actually improved over that period at a far higher rate than might have been expected from historical trends. [...] that the carriers that performed best through the mid-1970's - and had the least room for improvement in the ensuing decade - nonetheless achieved as much on a proportional basis as their higher-risk counterparts. And we will argue that while U.S. domestic air travel became safer over the period of interest, the improvement would have been greater still in the absence of airline deregulation.

Those thinking about the changed environment for the period 1986-1996 might reflect upon

In any case, let the figures be as they may, just remember that it's demonstrably safer to take United to the grocery store if you're an inebriated teenager with a tiny car. And next time the captain reminds you how safe air travel is compared with using the car, regale her with tales from Evans, Frick, Schwing, Barnett, Sivak, Weintraub and Flannagan, and then remind her you're taking the bus home!

Peter Ladkin

Footnotes

(1): I should point out, though, that when one buries oneself in engineering, the definitions of risk can become complicated and abstract; see for example (Lev95, Chapter 9, Terminology).

(2): In the wake of the Valujet DC-9 crash into the Florida Everglades in May 1996, for which the probable cause is expected to be determined as a cargo-hold fire caused by oxygen cylinders that had not been properly prepared for transportation and contained residual combustibles which caught fire. The airline management, the maintenance organisation which prepared the cylinders, and the Miami branch of the FAA responsible for oversight have all been discussed in preliminary hearings as possibly contributing to this dangerous situation, which in this case resulted in the worst possible outcome. The maintainer has been required to close its Miami and Orlando operations - some have suggested that this is shutting the door after the horse has bolted. The carrier was required to close its passenger operations and rework its management, and was permitted to recommence operations at a severely reduced level. It has recently merged with another small airline and will continue these operations under a new name. We await the NTSB report on the accident in mid-to-late 1997.

References

(Anon97): Anonymous ex-Martinair Crew, personal communication, 1997.

(Bar90): Arnold Barnett, Air Safety: End of the Golden Age?, Chance: New Directions for Statistics and Computing 3:8-12, 1990.

(Bar91): Arnold Barnett, It's Safer to Fly, Risk Analysis 11(1):13, 1991.

(BaHi89): Arnold Barnett and Mary K. Higgins, Airline Safety: The Last Decade, Management Science 35:1-21, 1989.

(BaMe92): Arnold Barnett, John Menighetti and Matthew Prete, The Market Response to the Sioux City DC-10 Crash, Risk Analysis 12(1):45-52, 1992.

(Boe96): Boeing Commercial Airplane Group, Statistical Summary of Commercial Jet Aircraft Accidents, Worldwide Operations, 1959-1995, Author, 1996.

(Comp.LaudaAir): Peter B. Ladkin, Computer-Related Incidents with Commercial Aircraft, Section The Lauda Air B767 Accident.

(Comp.Martinair): Peter B. Ladkin, Computer-Related Incidents with Commercial Aircraft, Section Information about the Martinair B767 EFIS-loss Incident near Boston, MA.

(DaSo92): Jonathan Dancy and Ernest Sosa, eds., A Companion to Epistemology (Blackwell Companions to Philosophy Series), Blackwell, 1992.

(EvFr90): Leonard Evans, Michael C. Frick and Richard C. Schwing, Is It Safer to Fly or Drive?, Risk Analysis 10(2):239-246, 1990.

(Gly92): Clark Glymour, Thinking Things Through, MIT Press, 1992.

(Lev95): Nancy G. Leveson, Safeware: System Safety and Computers, Addison-Wesley, 1995.

(SiWe91): Michael Sivak, Daniel J. Weintraub and Michael Flannagan, Nonstop Flying Is Safer Than Driving, Risk Analysis 11(1):145-148, 1991.




Copyright © 1999 Peter B. Ladkin, 1999-02-08
Last modification on 1999-03-16