How Safe is the Boeing 737 Max's MCAS System?

PeterDonis · Sep 23, 2019

etudiant said:

I believe the MCAS operation differed from that of runaway trim in that with MCAS, trim could be restored, but after a six second interval, MCAS would aggressively trim down again.

The symptoms are not identical, that's true. But that's part of the point being made by the author of the article: a "rote pilot" only learns what to do if a particular set of symptoms occurs exactly as he learned it in training; he doesn't learn a more general understanding of what the various systems do and how they interact. But most failures in flying do not present exactly the symptoms the pilot learned in training, so a pilot who only learns how to respond to those specific symptoms is at a disadvantage.

etudiant said:

In subsequent tests, FAA flight crews using the simulator were unable to recover the airplane in a sufficiently high percentage of the runs to cause consternation among the regulators.

Yes, as I've already said, I don't think that MCAS itself is safe.

FactChecker · Sep 23, 2019

PeterDonis said:

The symptoms are not identical, that's true. But that's part of the point being made by the author of the article: a "rote pilot" only learns what to do if a particular set of symptoms occurs exactly as he learned it in training;

Computer logic can get too complicated to recognize all the possibilities and anticipate how it will react to your actions. A flight-critical system must be very fault-tolerant and the pilots must be trained for all its modes.

FactChecker · Sep 23, 2019

etudiant said:

I believe the MCAS operation differed from that of runaway trim in that with MCAS, trim could be restored, but after a six second interval, MCAS would aggressively trim down again.

Apparently the MCAS system had as much authority as the pilots, and was active for longer periods than the intervals it gave them. That, in addition to its lack of redundancy and inability to recognize that the pilot was fighting it, was a tragedy waiting to happen.

anorlunda · Sep 23, 2019

FactChecker said:

all its modes.

It is those many modes themselves that give rise to many of the problems. Having many modes works against simplicity and ease of understanding.

For example, the power steering and power brakes (ignoring ABS brakes) in your car have only a single mode. They do not cause confusion. Whether the details of their implementation are dumb or smart, analog or digital, is immaterial.

hutchphd · Sep 23, 2019

PeterDonis said:

Yes, as I've already said, I don't think that MCAS itself is safe.

What bothers me most is that the motivation here had nothing to do with good engineering. This was an attempt to use the aerodynamic trim system in a dynamic way to make the aircraft emulate the better flight control characteristics of its predecessors in the series. Rather than do the necessary mechanical redesign to incorporate the more efficient engines in an aerodynamically sound way, this much less robust kluge was initiated, approved, and insufficiently tested.
It would be very good to know the machinations by which this occurred.

FactChecker · Sep 23, 2019

anorlunda said:

Whether the details of their implementation are dumb or smart, analog or digital, is immaterial.

A digital system can easily be orders of magnitude more complex than a realistic analog system. A well-designed system can smoothly transition between many modes without the pilot needing to change his behavior (of course, there are exceptions). IMHO, the flaws in the MCAS design were very serious.

anorlunda · Sep 23, 2019

FactChecker said:

A digital system can easily be orders of magnitude more complex than a realistic analog system. A well-designed system can smoothly transition between many modes without the pilot needing to change his behavior (of course, there are exceptions).

Complexity and operating modes played a major role in the USS John S McCain collision. Note that the Navy recently announced that they are returning to steering wheel and throttle levers on all Navy ships. I think it is significant that they did not call for better design of the digital systems, but chose to revert to the ancient wheel and throttle lever method.

https://en.wikipedia.org/wiki/USS_John_S._McCain_and_Alnic_MC_collision said:

In August 2019, Admiral Bill Galinis, who oversees U.S. Navy ship design, said the touchscreen-based control systems were "overly complex" because shipbuilders had little guidance on how they should work, so sailors were not sure where key indicators could be found on the screen; this confusion contributed to the collision. The Navy is planning to replace all touchscreens with wheels and throttles on all of its ships, starting in mid-2020.

OCR · Sep 24, 2019

OK, I'm just .~~experimenting~~. fooling around here, but I wanted to see if I could make a link to that USS John S. McCain incident you posted about. . . looks like it worked

.

Wikipedia said:

About the USS John S. McCain and Alnic MC collision:

In August 2019, Admiral Bill Galinis, who oversees U.S. Navy ship design, said the touchscreen-based control systems were "overly complex" because shipbuilders had little guidance on how they should work, so sailors were not sure where key indicators could be found on the screen; this confusion contributed to the collision. The Navy is planning to replace all touchscreens with wheels and throttles on all of its ships, starting in mid-2020.

I hadn't read about the incident you posted, and right at first I thought you were referring to this one. . .

1967 USS Forrestal fire - Wikipedia

"On that Saturday morning in July, as I sat in the cockpit of my A-4 preparing to take off, a rocket hit the fuel tank under my airplane."

- John McCain -

.

gmax137 · Sep 24, 2019

hutchphd said:

It would be very good to know the machinations by which this occurred.

I have no knowledge on this issue, but I suspect that the Boeing customers (or maybe their biggest customer) said,
"We will buy airplanes that:
- improve fuel economy by XX percent
- do not require pilot re-certification
- do not require changes to our existing gates
And if your design does not meet these requirements, we will go to Brand X instead..."

We all know now that the design that Boeing came up with to meet these requirements is flawed. But maybe the requirements are also flawed?

Dr Transport · Sep 24, 2019

gmax137 said:

I have no knowledge on this issue, but I suspect that the Boeing customers (or maybe their biggest customer) said,
"We will buy airplanes that:
- improve fuel economy by XX percent
- do not require pilot re-certification
- do not require changes to our existing gates
And if your design does not meet these requirements, we will go to Brand X instead..."

We all know now that the design that Boeing came up with to meet these requirements is flawed. But maybe the requirements are also flawed?

It all boils down to $$$...

hutchphd · Sep 24, 2019

gmax137 said:

I have no knowledge on this issue, but I suspect that the Boeing customers (or maybe their biggest customer) said,
"We will buy airplanes that:
- improve fuel economy by XX percent
- do not require pilot re-certification
- do not require changes to our existing gates
And if your design does not meet these requirements, we will go to Brand X instead..."

We all know now that the design that Boeing came up with to meet these requirements is flawed. But maybe the requirements are also flawed?

But part of Boeing's charge is to manage the expectations of their customer. That is what good management does. When told "I want it cheap, fast, and good", the response has to be "you can choose two out of three"...
I feel certain there was a cadre of engineers at Boeing who were fully aware of the quality of this effort. I wonder if they are still employed there (where else would they go?)... sad to watch the death spiral of another great technical organization.

nsaspook · Oct 7, 2019

https://www.msn.com/en-us/news/world/engineer-ethiopian-airlines-went-into-records-after-crash/ar-AAIpgFP?ocid=spartanntp

SEATTLE (AP) — Ethiopian Airlines' former chief engineer says in a whistleblower complaint filed with regulators that the carrier went into the maintenance records on a Boeing 737 Max jet a day after it crashed this year, a breach he contends was part of a pattern of corruption that included fabricating documents, signing off on shoddy repairs and even beating those who got out of line.

Johnny Yuma · Oct 29, 2019

I skimmed thru the posts and got more confused as I read. I am not conversant with this subject. I know nada.

I did read somewhere 5-6 months ago that Boeing installed larger engines, which are heavier but more fuel efficient. They did not factor in something when re-installed ...and that this is when stability issues started. The MCAS was installed to fix this. Any truth to this?

PeterDonis · Oct 29, 2019

Johnny Yuma said:

Boeing installed larger engines, which are heavier but more fuel efficient.

Yes. Also, because the engines are larger, they had to be moved forward on the wing so they wouldn't get too close to the ground when the plane was on the ground.

Johnny Yuma said:

They did not factor in something when re-installed ...and that this is when stability issues started. The MCAS was installed to fix this.

It's not that they didn't factor in the effects of the new engines; they did. The fact that the new engines were further forward on the wing caused a change in the plane's behavior, and Boeing knew about that change from the start and factored it into their planning. The issue was the way they did so.

The simplest and most straightforward way to deal with the engine change would have been to ask the FAA for a new type certificate for the 737 MAX because its behavior was different enough from other 737s due to the engine change. (The engine position in itself is not an issue; plenty of other aircraft types, including other Boeing types like the 757 and 767, have the engines forward on the wing like the 737 MAX does, so getting a new type certificate would not have been an issue from a technical standpoint.) The problem was that this would have required all pilots to get new type certifications to fly the 737 MAX, and that's a long and arduous process that Boeing didn't want to force its customers to go through with all of their pilots in order to buy the 737 MAX (and it seems pretty clear the customers wouldn't have wanted to do it even if Boeing tried to make them; they would just have bought Airbus aircraft instead).

The alternative Boeing chose was to add the MCAS system to the 737 MAX to automatically compensate for the effects of the engine change, in order to make the 737 MAX similar enough to other 737s from the pilot's point of view to allow it to share the same FAA type certification, and therefore to allow any pilot certified in the 737 type to fly it with only minor retraining (which has to happen any time a new version of any aircraft type is rolled out). That turned out not to work out well.

Ken G · Oct 29, 2019

Yes, it seems that the idea of using MCAS to avoid recertification was reasonable; the problem was that they also tried to downplay its significance, to the point that some flight crews didn't even know it was on the plane, and few, including the maintenance crews that worked on the one critical angle-of-attack sensor that MCAS was built to rely on, seemed to understand how crucial it was that MCAS received good data. The system did not necessarily even report when the two angle-of-attack sensors didn't agree, even though only one was used by MCAS. That just doesn't seem like solid design, but worse is that the design weakness was not well publicized. The only thing more dangerous than an underdesigned critical system is not being open with the information about the potential dangers.

jedishrfu · Oct 29, 2019

Yes, and they did all this to compete with Airbus, who were able to use the more fuel-efficient engines without changing their plane's flight behavior.

hutchphd · Oct 29, 2019

I found this Wikipedia article to be a remarkably complete and damning litany of bad management-driven engineering. In particular the dynamic use of the trimming system to make the aircraft emulate its progenitors seems reckless in the extreme.
https://en.wikipedia.org/wiki/Maneuvering_Characteristics_Augmentation_System

FactChecker · Oct 29, 2019

hutchphd said:

In particular the dynamic use of the trimming system to make the aircraft emulate its progenitors seems reckless in the extreme.

It is possible to safely do all sorts of things with a flight control, including trimming, but appropriate care must be taken. An extreme example is the F-35 flight control, which can seamlessly transition from hovering to forward flight. It is also possible to implement safety features like an auto-pitch rocker for stall recovery and like terrain avoidance. But all that must be carefully done, with redundancy, fault mitigation, and appropriate control authority. If done right, these can greatly improve the safety of the plane. It doesn't seem like Boeing followed basic safety principles in the MCAS design.
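To make "appropriate control authority" concrete, here is a minimal sketch of one way an automatic trim function could be bounded. It is not Boeing's design: the class name `TrimLimiter`, the 2.0 degree cumulative cap, and the sign convention (negative means nose down) are all assumptions chosen for illustration. The idea is that the automation has a fixed total budget and yields permanently once the pilot trims against it.

```python
from dataclasses import dataclass


@dataclass
class TrimLimiter:
    """Hypothetical limiter for automatic trim commands (illustration only)."""
    max_total_deg: float = 2.0   # assumed cap on cumulative automatic trim
    used_deg: float = 0.0        # automatic trim applied so far
    inhibited: bool = False      # latched once the pilot opposes the automation

    def request(self, auto_cmd_deg: float, pilot_cmd_deg: float) -> float:
        """Return the automatic trim (degrees) actually allowed this cycle."""
        # Opposite signs mean the pilot is trimming against the automation,
        # so the automation yields and stays off.
        if pilot_cmd_deg * auto_cmd_deg < 0:
            self.inhibited = True
        if self.inhibited:
            return 0.0
        # Clamp the command to whatever is left of the cumulative budget.
        remaining = self.max_total_deg - self.used_deg
        allowed = max(-remaining, min(remaining, auto_cmd_deg))
        self.used_deg += abs(allowed)
        return allowed


limiter = TrimLimiter()
first = limiter.request(-1.5, 0.0)    # within budget
second = limiter.request(-1.5, 0.0)   # clipped to the remaining 0.5 degrees
```

With a cap like this, repeated activations cannot add up to more than the budget, which is the property the discussion above says was missing.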

hutchphd · Oct 29, 2019

FactChecker said:

It is possible to safely do all sorts of things with a flight control, including trimming

This is doubtless true but it seems pretty clear that this route of implementation was chosen (for marketing reasons!) primarily because it is invisible to the pilot. That is a reckless decision on its face.

russ_watters · Oct 29, 2019

hutchphd said:

This is doubtless true but it seems pretty clear that this route of implementation was chosen (for marketing reasons!) primarily because it is invisible to the pilot. That is a reckless decision on its face.

I don't understand this position. The entire point of automated stability augmentation systems is to change the "feel" of an airplane so that it feels different/better to the pilot. If it works properly, the pilot never knows how the plane would "feel" without it. In that sense, they are always inherently invisible; that's what they are for.

The issue, to me, is that this particular system was poorly implemented, having a failure mode that was way, way worse than the behavior it was there to correct. The reckless part isn't that it existed, it is that it was allowed to exist in what should have been (and may have actually been) an obviously faulty implementation.

hutchphd · Oct 29, 2019

russ_watters said:

The entire point of automated stability augmentation systems is to change the "feel" of an airplane so that it feels different/better to the pilot

The 737 is not (I think) a fly by wire aircraft so the question is what is a necessary and sufficient reason to add an extra layer of complexity to an absolutely vital control system. Any increase in complexity augments risk.

To my mind the only reason for the system was marketing; allowing pilots to fly without any recertification. Trading nontrivial flight-control risk for marketing points is reckless behavior and bad engineering in my book.

artis · Oct 30, 2019

Sure, I can see how, if the pilots were more professional, they could in theory have escaped their fate like the crew before them did, but it is an absolutely idiotic engineering decision to make a product for mass consumption that requires in all cases the expertise and experience of a "stable genius".
Even good pilots differ; after all, they're just people. Some may have lower stress tolerance in extreme situations while having the same experience and capabilities as other good pilots. I personally believe that in each device or gadget we engineer, the hardware first has to be at its best possible, so that it performs flawlessly and the only thing that limits the performance is the laws of physics themselves; then we can add software and "gizmos" on top of that to push that performance even further.
In this case I assume they took a working plane with a proven track record (the previous 737 being around since the 1970's), then messed it up, made some changes without a full risk assessment, then realized that there were flaws, but instead of doing a full redesign just applied a software patch.
This all reminds me of how I "fixed" a broken gas pedal on a car that I was driving: I attached a string to the carburetor main air valve, gave the string to my friend and said, pull whenever I say pull and let go when I say let go. I got home without crashing, but the experience of not having control over a vital aspect of driving was rather ugly.

artis · Oct 30, 2019

I recommend this video; it's a short, easy to understand summary of the main reasons why the 737 was made as it was.

Without any political or cultural/economical bias I would dare to suggest that this is one of the examples where capitalism fails the consumer, because safety and engineering in general in this case, as in many others, have to compete not with science and the limits of physics but rather with economics and shareholders.

PS. I think it's easier to win over the laws of nature than the minds of humans

FactChecker · Oct 30, 2019

hutchphd said:

this route of implementation was chosen (for marketing reasons!) primarily because it is invisible to the pilot. That is a reckless decision on its face.

That is too strong a statement. It is ideal if a change is invisible to the pilot. It is due to other aspects that the design was dangerous.

anorlunda · Oct 30, 2019

This thread is so long that it is impractical to search past posts. One of the earlier posts (can't find it today) mentioned longer landing gear as an alternative to moving the engines forward and thus eliminating the need for MCAS. He said that the engineering work for longer landing gear had already been completed, but not used on the MAX.

I would like to hear more about that angle. Also, if anyone can find that earlier post in this thread and give a link, I would be grateful.

artis · Oct 30, 2019

Just as a side point, if someone has the data: I wonder how much Boeing has lost due to this whole saga, and how much they would have lost if they had simply delayed the latest upgrade a bit but done it right from a physics view.
I can bet that in the long term they will lose more due to this short-sighted thinking than if they had done it right in the first place.

@anorlunda I think the main reason was the same as already mentioned in my video: Boeing simply wanted to cut corners and save money. They essentially wanted a 737 but with updated electronics and better fuel economy. Making longer landing gear would probably also require changes to the main airframe itself, because the holes holding the gear are only so big.
In fact, for the 737 MAX 10 they made the landing gear extend out more and then, when retracting, shorten its length almost like a telescopic antenna.
They introduced that extra complexity in the gear just so that they don't have to redesign the chassis.
https://www.geekwire.com/2018/boeing-737-max-10-landing-gear/

see this link.

russ_watters · Oct 30, 2019

hutchphd said:

The 737 is not (I think) a fly by wire aircraft so the question is what is a necessary and sufficient reason to add an extra layer of complexity to an absolutely vital control system. Any increase in complexity augments risk.

I'm not clear on why you are bringing fly by wire into this. If you mean that the more direct control of non fly by wire should be inherently less risky, I'd say that's an oversimplification. While it is true that issues of complexity and pilots literally not knowing how/if their inputs were moving control surfaces has contributed to [all fly by wire] Airbus crashes, it's also likely prevented crashes by not allowing pilots to make improper demands on the aircraft. There's pros and cons. And it's not just about safety; ergonomics, and economics play a role too. It's a complex balance. It's not black and white.

hutchphd said:

To my mind the only reason for the system was marketing; allowing pilots to fly without any recertification. Trading nontrivial flight-control risk for marketing points is reckless behavior and bad engineering in my book.

Well, but that's just it; if MCAS existed to counter a "non-trivial flight control risk", that would be a stand-alone problem; a plane with a less than sufficiently safe flight control system should not be certified to fly, period. Safety is a stand-alone consideration, up to a minimum floor.

Having to re-certify pilots to operate a new plane is an inconvenience, not a safety problem.

russ_watters · Oct 30, 2019

artis said:

Just as a side point, if someone has the data: I wonder how much Boeing has lost due to this whole saga, and how much they would have lost if they had simply delayed the latest upgrade a bit but done it right from a physics view.
I can bet that in the long term they will lose more due to this short-sighted thinking than if they had done it right in the first place.

I'd say that the cost of doing it sufficiently right the first time would have been close to zero. The problem is two-pronged:

1. Poorly written software. If the software had been written better, we likely would never have heard of this issue. And it would have cost essentially nothing.

2. Lack of robustness in the control system (use of only one aoa sensor). This is what is causing most of the implementation delays, and would have been a multi-million dollar issue during design. But it is apparently a long-standing but evidently minor weakness in Boeing aircraft that hasn't caused significant issues before.

But as you suggest, even #2 would have been many orders of magnitude cheaper than the tens of billions this will end up costing.

PeterDonis · Oct 30, 2019

russ_watters said:

If the software had been written better, we likely would never have heard of this issue.

I'm not sure that the single aoa sensor issue could have been entirely mitigated just by writing better software. Better software might have reduced the severity of the aoa sensor failure mode to the point where an incident like the Lion Air or Ethiopian Airlines crashes would have been non-fatal, but I think we would still have heard about them and the issue would still have surfaced.

russ_watters · Oct 30, 2019

PeterDonis said:

I'm not sure that the single aoa sensor issue could have been entirely mitigated just by writing better software. Better software might have reduced the severity of the aoa sensor failure mode to the point where an incident like the Lion Air or Ethiopian Airlines crashes would have been non-fatal, but I think we would still have heard about them and the issue would still have surfaced.

Maybe, but yes, that's my point. I watch incident report videos on youtube a lot and it amazes me the severity of near-misses that never make the news*. It seems like it requires a smoking hole to be newsworthy.

*Yesterday I watched one about a commuter jet pilot receiving confusing ATC instructions and descending to 7,800' in an area with a mandatory floor of 10,000'. The pilots didn't catch the error until their avionics told them to pull-up to avoid terrain. That's often the last thing the pilot hears a couple of seconds before impact. So instead of 20 people dead, it's a stiff drink and some paperwork, and few other people ever hear of it.

[edit] One other thing I learned is that motorized and/or automatic trim problems happen a lot.

PeterDonis · Oct 30, 2019

russ_watters said:

I watch incident report videos on youtube a lot and it amazes me the severity of near-misses that never make the news*.

Hm, yes, that's a valid point.

russ_watters · Oct 30, 2019

PeterDonis said:

Hm, yes, that's a valid point.

...and not for nothing, but the single-sensor-single-computer architecture is decades old. I'm not sure of the extent to which it was known/considered a problem before, but clearly not enough to prompt a change before MCAS.

Still, increasing complexity increases the number of failure modes, so that issue would only increase over time. So it is tough to know either way -- so you may be right...and I suppose ultimately these accidents were that trigger-point that prompted the change.

hutchphd · Oct 30, 2019

russ_watters said:

I'm not clear on why you are bringing fly by wire into this

On a fly-by-wire system this attempt to mimic handling characteristics of a different airplane would have been much more straightforward (not the kluge that eventually resulted). In addition there would have been extant protocols for retest and they would likely have caught major flaws.

russ_watters said:

Having to re-certify pilots to operate a new plane is an inconvenience, not a safety problem.

Yes, I could not agree more. So why did Boeing, in order to sell more aircraft, sacrifice design integrity to remove this "inconvenience" from their customer?

IMHO: The overarching issue here is not one of bad engineering or insufficient testing. It is an indicator of a defect in corporate culture. The fact that it took place in a paragon of engineering excellence is troubling in the extreme. Let us not get lost in the technical detail.

PeterDonis · Oct 30, 2019

russ_watters said:

the single-sensor-single-computer architecture is decades old

Yes, but AFAIK nothing before MCAS enabled the single sensor and single computer to take uncommanded actions that could put the plane into an unrecoverable situation if the actions were wrong.

IMO any system in a plane that can take uncommanded actions at all needs to have multiple sensors and the corresponding sensor failure detection, and if sensor failure is detected the system disables itself and tells the flight crew. One of the things that shocked me about some of the Airbus incidents (e.g., Qantas 72) was that, even though the plane had multiple aoa sensors, the automated system that triggered multiple uncommanded pitch down events only used one of them and did not even look at the others to check the one sensor. That seems insane to me.
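The rule described here (compare the sensors, and on disagreement disable the system and tell the crew) can be sketched in a few lines. This is only an illustration under stated assumptions, not any manufacturer's logic: the names `AugmentationState` and `cross_check_aoa`, the alert strings, and the 5 degree disagreement threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AugmentationState:
    """Hypothetical state of an automated augmentation function."""
    enabled: bool = True               # whether the automation may act
    crew_alert: Optional[str] = None   # message shown to the flight crew, if any


def cross_check_aoa(readings_deg: Sequence[float],
                    state: AugmentationState,
                    max_disagree_deg: float = 5.0) -> Optional[float]:
    """Return a usable AoA value, or None if the sensors cannot be trusted.

    max_disagree_deg is an assumed threshold chosen for illustration.
    """
    if not state.enabled:
        return None  # once disabled, stay disabled (latched)
    if len(readings_deg) < 2:
        # A single source cannot be checked against anything.
        state.enabled = False
        state.crew_alert = "AOA SINGLE SOURCE: augmentation disabled"
        return None
    spread = max(readings_deg) - min(readings_deg)
    if spread > max_disagree_deg:
        # Sensors disagree: stop acting and tell the crew.
        state.enabled = False
        state.crew_alert = "AOA DISAGREE: augmentation disabled"
        return None
    return sum(readings_deg) / len(readings_deg)


state = AugmentationState()
good = cross_check_aoa([4.0, 5.0], state)    # sensors agree, value is usable
bad = cross_check_aoa([4.0, 25.0], state)    # sensors disagree, system disables itself
```

With only two sensors the check can detect a disagreement but cannot tell which sensor is wrong, which is why the sketch disables the function rather than picking one reading.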

russ_watters · Oct 30, 2019

PeterDonis said:

Yes, but AFAIK nothing before MCAS enabled the single sensor and single computer to take uncommanded actions that could put the plane into an unrecoverable situation if the actions were wrong.

I'm not sure the extent of its influence, but I would have assumed that the flight control computer's primary if not sole reason for existing is to make uncommanded actions.

PeterDonis said:

IMO any system in a plane that can take uncommanded actions at all needs to have multiple sensors and the corresponding sensor failure detection, and if sensor failure is detected the system disables itself and tells the flight crew. One of the things that shocked me about some of the Airbus incidents (e.g., Qantas 72) was that, even though the plane had multiple aoa sensors, the automated system that triggered multiple uncommanded pitch down events only used one of them and did not even look at the others to check the one sensor. That seems insane to me.

Agreed, but I wonder if we have a modern bias? To ironically quote Apollo 13: today we have "computers that can fit into a single room and hold millions* of pieces of information..." Today we consider processing power to be an utter triviality [new thread idea...].

*I'm not sure that was even true; it was probably dozens or hundreds.
