Risk Index for Shared Components

In summary: faulty image analysis software may have invalidated some 40,000 fMRI studies, and severe flaws in a widely used open source library have put many dependent projects at risk.
  • #1
anorlunda
In an earlier thread, Science Vulnerability to Bugs, I mentioned this case:
http://catless.ncl.ac.uk/Risks/29.60.html said:
Faulty image analysis software may invalidate 40,000 fMRI studies
Here is a similar case.
http://catless.ncl.ac.uk/Risks/29/59#subj8 said:
Severe flaws in widely used open source library put many projects at risk
In another recent case (I can't find the link), an author decided to un-license his public-domain contribution and withdrew it from publicly shared libraries, which broke a great many products that depended on it.

I am not a computer scientist, but I had a very computer-science-like follow-up thought on this subject.

We can define Risk = Probability × Exposure; let's say ##R = P \cdot E##. When we apply that to a software component, P is the probability of a flaw in the software and E is the number of places where the component is used.

Our confidence in a component increases with time and with the diversity of use without flaws being reported; thus P is a function of time t and of E. E may also grow with time, so E is a function of t as well. The interesting question is which outraces the other. Most quality improvement programs focus exclusively on P.

Given those, we could compute R(t). We should then be able to use ##\frac{dR}{dt}## as an index of the acceptability of risk. ##\frac{dR}{dt}=0## is a likely choice for the boundary between acceptable and unacceptable risk. It also suggests that capping E is an alternative to lowering P as a remedy for excessive risk.
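To spell out the step from ##R = P \cdot E## to that criterion (nothing beyond the definitions above is assumed), apply the product rule:

$$\frac{dR}{dt} = \frac{dP}{dt}\,E + P\,\frac{dE}{dt}.$$

Setting ##\frac{dR}{dt} = 0## is then exactly the condition that the relative improvement in P keeps pace with the relative growth in E:

$$-\frac{1}{P}\frac{dP}{dt} = \frac{1}{E}\frac{dE}{dt}.$$

If adoption outraces reliability improvement, R is growing and the risk is (by this index) unacceptable; improving P faster or capping E restores the balance.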

This is the kind of question that I expect should have been explored when discussion of shared, reusable software components first became popular in the 1980s.

My question is, has this subject been explored in computer science? If yes, are there links?
 
  • #2
anorlunda said:
Most quality improvement programs focus exclusively on P.

I'm not sure I understand your thinking well enough to say whether I agree or disagree, but software is definitely released in stages of increasing E.

For example:

Unit test
Integration test at multiple levels
Handoff to test group for independent testing
Beta user testing, often multiple rounds with increasing E in each round
Production release
 
  • #4
Grinkle said:
I'm not sure I understand your thinking well enough to say whether I agree or disagree, but software is definitely released in stages of increasing E.

The link about 40,000 potentially invalidated scientific studies is a better example of what I was thinking of. I think the culprit was a flaw in a statistical library.

I'm thinking of an open source stat function initially distributed to E=100 users that becomes popular and gains E=1,000,000 users. I see no sliding scale of quality that keeps P*E from growing as things go viral and E skyrockets. A function with ##10^4## times as many users should be able to afford a QA budget at least ##10^2## times larger.
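To put the same point in back-of-the-envelope numbers drawn from the figures above: at initial distribution, ##R_0 = P_0 \cdot 100##. Once the function has ##E = 10^6## users, holding R at ##R_0## requires pushing the flaw probability down to ##P = P_0/10^4##, a ten-thousand-fold improvement that nothing in the typical open source funding model mandates.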

Not putting all our eggs in one basket is another way to express it. A stat library used by nearly everyone in science may be too risky. Just for the sake of diversity, there should be several competing packages, from different origins, that each gain a large share of the users. I have heard of military applications where the government ordered independent implementations of the same software from different suppliers to reduce risk, but I have never heard of this practice gaining wide acceptance.
 
  • #5
I am not an academician, but fwiw, I offer my free-market perspective.

Standards are proposed and, if adopted, may arguably improve the greater good. Many more standards are proposed than are adopted by industry, because individual players want to win: most proposed standards involve some change from the status quo for many players, and some of that change is not perceived by some players as beneficial to their competitive advantage. It is often true that some players would be eliminated altogether if certain standards were adopted, even though the standard is arguably better for the greater good.

If you were asked by a standards body to use software you personally thought was inferior to what you currently use, for the sake of global scientific risk reduction, would you agree? If you did not believe the software you were being asked to use was actually functionally equivalent to what you are using now, would you agree? If you thought your research/publishing rival was influencing the standards committee to assign you the rubbish software, would you agree? And so on.

I see a lot of human interaction factors involved in what you are suggesting.

edit:

Ignoring that, it is interesting to ponder modelling the comparative risk of deploying multiple less mature solutions vs. fewer more mature solutions. Maturity comes from usage, and many software defect models predict that multiple less mature solutions present the user base with more total bugs than one single more mature solution. I think studies would need to be done on how many bugs overlap to see which is safer. If all solutions tend to suffer from bugs with similar symptoms, then many is (I think) by inspection worse than one. If bug symptoms are mostly unique to each individual implementation, then perhaps many can be better, even if the total number of extant bugs is greater at any given point in time.
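To make that comparison concrete, here is a minimal sketch in C++ under one assumed (and admittedly crude) defect model, not something taken from the thread or from the literature: each implementation's residual flaw probability decays with its own cumulative usage as ##P(u) = P_0/(1 + u/u_0)##, and the total exposure is the sum of ##P_i \cdot E_i## over implementations sharing a fixed user base.

```cpp
#include <initializer_list>
#include <iostream>

// Assumed toy defect model (not from the thread or any published study):
// an implementation's residual flaw probability decays with its cumulative
// usage u as P(u) = P0 / (1 + u/u0).
double residual_flaw_probability(double usage,
                                 double P0 = 0.1,
                                 double u0 = 1000.0) {
    return P0 / (1.0 + usage / u0);
}

// Total exposure R = sum over implementations of P_i * E_i, when a fixed
// user base E_total is split evenly across k independent implementations.
double total_risk(double E_total, int k) {
    double share = E_total / k;  // users (and hence maturity) per implementation
    return k * residual_flaw_probability(share) * share;
}

int main() {
    const double E_total = 1000000.0;  // total user base shared by all solutions
    for (int k : {1, 2, 5, 10}) {
        std::cout << "implementations k = " << k
                  << ", total P*E = " << total_risk(E_total, k) << '\n';
    }
    return 0;
}
```

Under this particular decay assumption the single, more mature implementation always minimises total P*E; the case for diversity only appears once a simultaneous, common-mode loss of the whole user base is weighted as worse than several smaller, uncorrelated losses, which is the distinction the later posts turn on.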
 
  • #6
In Boost.Multiprecision one can select which of several very different backends to use by simply changing one line of code and recompiling.

While not foolproof, it certainly allows one to be more confident in the results if they don't change significantly when the backend is changed.
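For illustration, here is a minimal sketch of the pattern Lord Crc describes, assuming the stock cpp_dec_float_50 and cpp_bin_float_50 convenience typedefs that ship with Boost.Multiprecision; switching the single alias and recompiling reruns the same computation on a very different backend, and agreement between the two outputs is the cross-check.

```cpp
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <boost/multiprecision/cpp_bin_float.hpp>
#include <iostream>
#include <iomanip>
#include <limits>

// Swap the backend by changing this one alias and recompiling.
// cpp_dec_float_50 is a decimal backend, cpp_bin_float_50 a binary one;
// both are convenience typedefs provided by Boost.Multiprecision.
using real = boost::multiprecision::cpp_dec_float_50;
// using real = boost::multiprecision::cpp_bin_float_50;

int main() {
    // A rounding-sensitive computation: partial sum of 1/n^2,
    // accumulated in ascending order.
    real sum = 0;
    for (int n = 1; n <= 100000; ++n)
        sum += real(1) / (real(n) * real(n));

    // Print at full precision so the output of the two backends can be diffed.
    std::cout << std::setprecision(std::numeric_limits<real>::digits10)
              << sum << '\n';
    return 0;
}
```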
 
  • #7
Grinkle said:
This paper talks about deployment metrics being an indicator of defect discovery rate, but it does not focus specifically on that aspect.

http://www.cs.cmu.edu/~paulluo/Papers/LiHerbslebShawISSRE.pdf

Thanks for the link. Yes, the paper looks at how P changes as the experience base grows, but it doesn't consider post-deployment growth in E.

Lord Crc said:
In Boost.Multiprecision one can select which of several very different backends to use by simply changing one line of code and recompiling.

While not foolproof, it certainly allows one to be more confident in the results if they don't change significantly when the backend is changed.
That sounds like a very intelligent way to handle multiple back ends, thus achieving some diversity. I wonder what motivated them to do that.

Grinkle said:
Ignoring that, it is interesting to ponder modelling the comparative risk of deploying multiple less mature solutions vs fewer more mature solutions. Maturity comes from usage, and many software defect models predict that multiple less mature solutions present the user base with more total bugs than one single more mature solution. I think studies might need to be done on how many bugs are overlapping to see which is safer.

Yes indeed. The point of my OP was not to advocate any particular remediation, but rather to advocate computer science research. Many in software engineering love the 20,000-foot view of the broad playing field, and big data from a large number of projects.

I did think of one more scenario that paints the concern graphically. Suppose we learned that a significant fraction of US military weapons systems depend on a single software component, even though confidence in that component is extremely high. How much concern should that cause?

From a historical viewpoint, the Y2K bug comes to mind. The reason that Y2K caused so much concern, and so much diversion of resources to check for the bug, was precisely because it was so ubiquitous and cut across boundaries that we had believed made systems independent of each other. Y2K was an unsuspected common dependency.
 
  • #8
anorlunda said:
Y2K was an unsuspected common dependency.

If diversity of implementation is proposed as a robustness improvement, then Y2K strikes me as a counterexample to that proposal. Diversity of implementation did not offer any robustness there. It is an example of many different implementations all potentially containing different coding bugs that lead to the same symptom (the inability to distinguish between centuries when the century changes). In this case, if there were only a single date-calculation algorithm that all the Earth's software used, it would have been a trivial fix. It was exactly the diversity of implementation in date-calculation approaches that caused the concern: each implementation needed examining and fixing.

Can you draw a couple speculative graphs of how you envision a scenario where P*E is helped by multiple implementations over time and a scenario where it is not? P obviously increases with each independent implementation, and the rate of maturation of each implementation decreases as the user base is diluted. I think you were saying this in your OP?
 
  • #9
Grinkle said:
Y2K strikes me as a counterexample
The problem is dependencies on common things. In risk analysis we use the term "common-mode failures." In the case of Y2K, the commonality was not a shared component, but rather a shared method of expressing dates.

Grinkle said:
Can you draw a couple speculative graphs of how you envision a scenario where P*E is helped by multiple implementations over time and a scenario where it is not?
Consider the military example. Let's assume dependency on a software routine critical to 100% of the USA's weapons. The threat is that an enemy cyberwar unit discovers a vulnerability in that common thing. They could knock out 100% of our weapons at once. If we had the diversity of two independent implementations of that thing, then their vulnerabilities would differ. The enemy would likely be limited to knocking out 50% of our weapons on any one occasion. Losing 50% at a time on two occasions is much less scary than losing 100% simultaneously.
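To put rough numbers on that intuition (a purely illustrative calculation, not an analysis of any real system): if each independently developed implementation harbours an exploitable flaw with probability ##q##, then with one shared routine the chance of a total, simultaneous loss is ##q##, whereas with two independent implementations a single exploit removes only about half the fleet, and a simultaneous total loss requires flaws in both, roughly ##q^2## under independence, and rarer still in practice since both would have to be exploited at the same time.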
 
  • #10
anorlunda said:
a shared method of expressing dates.

No method of date storage or expression was shared. The methods were arrived at independently by independent developers, and they all suffered from defects that depended on the specific implementation but potentially showed a common symptom. Shared vs. independently arrived at is a key distinction, and very relevant to your thesis. It goes to why I am saying Y2K is a counterexample to the argument that using different implementations for the same application will reduce the risk of being impacted by a code defect.
 

Related to Risk Index for Shared Components

1. What is a risk index for shared components?

A risk index for shared components is a quantitative measure that assesses the potential risks associated with using shared components in a system or project. It takes into account factors such as complexity, criticality, and potential impact of failure.

2. How is a risk index for shared components calculated?

A risk index for shared components is typically calculated by assigning numerical values to different risk factors, such as likelihood of failure, severity of consequences, and level of control over the component. These values are then combined using a formula to determine the overall risk index.
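As a purely illustrative sketch of such a formula (the 1-5 scales and the weighting scheme below are assumptions for demonstration, not a standard; the combination simply echoes the R = P*E idea from the thread):

```cpp
#include <iostream>

// Illustrative factor scores for one shared component, each rated 1 (low)
// to 5 (high). The combination rule below is one possible choice, not a
// standard formula.
struct ComponentRisk {
    double likelihood_of_failure;  // how likely the component is to fail
    double severity;               // how bad a failure would be
    double lack_of_control;        // how little control we have over it
    double exposure;               // number of places the component is used
};

double risk_index(const ComponentRisk& c) {
    // Collapse the three 1-5 factors into a pseudo-probability in (0, 1],
    // then scale by exposure, in the spirit of R = P * E.
    double p = (c.likelihood_of_failure * c.severity * c.lack_of_control) / 125.0;
    return p * c.exposure;
}

int main() {
    // Hypothetical widely used statistics library.
    ComponentRisk stats_lib{2.0, 4.0, 4.0, 1000000.0};
    std::cout << "risk index = " << risk_index(stats_lib) << '\n';
    return 0;
}
```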

3. Why is a risk index for shared components important?

A risk index for shared components is important because it helps identify potential risks and prioritize them for mitigation. It also allows for a more efficient allocation of resources to address the most critical risks, reducing the overall risk of failure in a system or project.

4. What are some common risk factors included in a risk index for shared components?

Some common risk factors included in a risk index for shared components are complexity, criticality, level of integration, and level of control. Other factors may include security vulnerabilities, compatibility issues, and potential for single point of failure.

5. How can a risk index for shared components be used to improve system or project management?

A risk index for shared components can be used to improve system or project management by providing a clear understanding of potential risks and their level of impact. This allows for effective risk mitigation strategies to be implemented and for resources to be allocated more efficiently to address the most critical risks.
