
BeyondTest - Focusing Quality Everywhere


Knowledge Engineering using Bayesian Network

Bayesian Belief Network

What is a Bayesian Network?

A Bayesian belief network (BBN), or Bayesian network, is a directed acyclic graph in which nodes represent variables and arcs represent dependence relations among the variables. If there is an arc from node A to node B, then A is said to be a parent of B. A node whose value is known is called an evidence node. A node can represent any kind of variable, be it an observed measurement, a parameter, a latent variable, or a hypothesis. Nodes are not restricted to representing random variables in the frequentist sense, because probabilities are read as degrees of belief; this is what is "Bayesian" about a Bayesian network.

A Bayesian network represents the joint distribution over all the variables represented by nodes in the graph. Let the variables be X(1), ..., X(n), and let parents(A) be the parents of node A. Then the joint distribution of X(1) through X(n) is the product of the probability distributions p(X(i) | parents(X(i))) for i from 1 to n, that is, p(X(1), ..., X(n)) = p(X(1) | parents(X(1))) × ... × p(X(n) | parents(X(n))). If a node has no parents, its probability distribution is said to be unconditional; otherwise it is conditional. Figure 1 is a simple BBN example.

Figure 1 BBN example
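
As a minimal sketch of this factorization, the Python snippet below builds a small hypothetical network (complexity → defects ← testing) and computes the joint probability of one full assignment as the product of each node's probability given its parents. The node names and all probability values are assumptions made up for illustration; they are not taken from Figure 1.

```python
from itertools import product

# Hypothetical network: complexity -> defects <- testing.
# All probabilities are made-up illustration values.
PARENTS = {
    "complexity": (),
    "testing": (),
    "defects": ("complexity", "testing"),
}

# CPT[node][parent values] = P(node is True | parent values)
CPT = {
    "complexity": {(): 0.3},
    "testing": {(): 0.6},
    "defects": {
        (True, True): 0.4,
        (True, False): 0.9,
        (False, True): 0.05,
        (False, False): 0.3,
    },
}


def node_prob(node, value, assignment):
    """Return p(node = value | parents(node)) under a full assignment."""
    key = tuple(assignment[parent] for parent in PARENTS[node])
    p_true = CPT[node][key]
    return p_true if value else 1.0 - p_true


def joint(assignment):
    """Multiply p(X(i) | parents(X(i))) over all nodes."""
    result = 1.0
    for node, value in assignment.items():
        result *= node_prob(node, value, assignment)
    return result


def total_probability():
    """Sum the joint over every assignment; a valid network sums to 1."""
    nodes = list(PARENTS)
    return sum(
        joint(dict(zip(nodes, values)))
        for values in product([True, False], repeat=len(nodes))
    )


example = {"complexity": True, "testing": False, "defects": True}
print(joint(example))        # 0.3 * (1 - 0.6) * 0.9
print(total_probability())   # should be 1 up to rounding
```

Nodes without parents use the empty tuple as their table key, which corresponds to the unconditional distributions mentioned above.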

Using a BBN can be divided into the following four steps:

  • Determine variable collection and variable domain
    This step depends on the problem domain. For early prediction of software reliability, for example, several complexity metrics are selected to build the BBN.
  • According to prior knowledge, determine network topology structure and probability distribution
    The number of possible topologies grows very quickly with the number of nodes: n nodes can already be ordered in n! ways, and the number of directed acyclic graphs over them is larger still. Prior knowledge from a domain expert is therefore very important for narrowing the search to the most plausible topology. If the domain expert can also supply reasonable probability distributions, much of the effort of collecting training data can be saved.
  • Adjust topology structure and probability distribution using training data
    It is useful to bring in prior knowledge from a domain expert, but as the number of nodes increases, such knowledge can become inaccurate. Training data therefore need to be collected to refine the topology and the probability distributions. This adjustment is done by a BBN learning algorithm. The purpose of learning is to use the training data D and the prior knowledge ξ to find the structure S whose posterior probability p(S | D, ξ) is maximal.
  • Predict using BBN
    Here, prediction means reasoning (inference). Bayesian reasoning calculates the conditional probability of some nodes given information about other nodes. Depending on the direction of the dependency between the nodes, BBN reasoning can be divided into three types:
    • Causal reasoning, which predicts a result from its cause. It is top-down reasoning.
    • Diagnostic reasoning, which infers a cause from an observed result. It is bottom-up reasoning.
    • Support reasoning, which analyzes how the causes of a common result influence one another.
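
The three reasoning types in the prediction step can be sketched with brute-force inference by enumeration. The network below (complex → defective ← rushed) and all of its numbers are hypothetical illustration values. Practical tools use more efficient algorithms such as the junction tree algorithm, so this is only a sketch of the idea. The third query corresponds to what the list calls support reasoning, which is often also called intercausal reasoning or "explaining away".

```python
from itertools import product

# Hypothetical network: complex -> defective <- rushed.
# All probabilities are made-up illustration values.
NODES = ("complex", "rushed", "defective")
P_COMPLEX = 0.3
P_RUSHED = 0.4
P_DEFECTIVE = {  # P(defective | complex, rushed)
    (True, True): 0.9,
    (True, False): 0.6,
    (False, True): 0.5,
    (False, False): 0.05,
}


def joint(is_complex, is_rushed, is_defective):
    """Joint probability of one full assignment of the three nodes."""
    p = P_COMPLEX if is_complex else 1 - P_COMPLEX
    p *= P_RUSHED if is_rushed else 1 - P_RUSHED
    p_def = P_DEFECTIVE[(is_complex, is_rushed)]
    return p * (p_def if is_defective else 1 - p_def)


def query(target, evidence):
    """P(target = True | evidence), summing the joint over consistent worlds."""
    numerator = 0.0
    denominator = 0.0
    for values in product([True, False], repeat=len(NODES)):
        world = dict(zip(NODES, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # world contradicts the evidence
        p = joint(world["complex"], world["rushed"], world["defective"])
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


# Causal (top-down): from a cause to its result.
causal = query("defective", {"complex": True})
# Diagnostic (bottom-up): from an observed result back to a cause.
diagnostic = query("complex", {"defective": True})
# Intercausal: a second known cause partly "explains away" the first.
intercausal = query("complex", {"defective": True, "rushed": True})

print(causal, diagnostic, intercausal)
```

With these numbers, observing a defect raises the belief that the module is complex above its prior of 0.3, and additionally learning that the work was rushed lowers that belief again, because the rush already accounts for part of the defect.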

Compared with traditional regression modeling, a BBN has the following advantages:

  • A BBN is closely tied to Bayesian statistics, which makes it useful for relating prior knowledge to data.
  • A BBN can handle data sets that are incomplete or noisy, which traditional models cannot.
  • Because a BBN captures both causality and probability, it fits easily into decision-making processes.
  • A BBN depicts the dependencies among variables graphically, so it is easy to understand and explain.

Quantitative Knowledge Engineering Process using Bayesian Network

This process was extracted and refined from the web material "Using Bayesian Networks for Water Quality Prediction in Sydney Harbour".

What have Bayesian Networks been used for?

Defect Detection - software debugging, safety and risk evaluation of complex systems

...

Related topic: BNN (Bayesian Neural Network)

Detailed information can be found here.

White papers & Articles


Dealing with the Expert Inconsistency in Probability Elicitation

Knowledge Engineering for Bayesian Networks

Designing a Procedure for the Acquisition of Probability Constraints for Bayesian Networks

Generating Conditional Probabilities for Bayesian Networks: Easing the Knowledge Acquisition Problem

Induction of Bayesian Networks with a priori Domain Knowledge

Knowledge Engineering for Probabilistic Models: A tutorial

Using Sensitivity Analysis for Selective Parameter Update in Bayesian Network Learning



Gary D. Boetticher, Machine Learners Answer the 300-Billion-Dollar Question, University of Houston-Clear Lake

Fenton, N., and M. Neil, A Critique of Software Defect Prediction Models, IEEE Transactions on Software Engineering, Vol. 25, No. 5, 1999

S. Bibi, I. Stamelos, Software Process Modeling with Bayesian Belief Networks, IEEE Transactions on Software Engineering, Vol. 25, No. 5, 1999

S. Bibi, I. Stamelos, L. Angelis, Bayesian Belief Networks as a Software Productivity Estimation Tool, Department of Informatics, Aristotle University

Trevor Cockram, Gaining confidence in Software Inspection using a Bayesian Belief Model, Rolls-Royce plc and The Open University

Jilles van Gurp & Jan Bosch, Using Bayesian Belief Networks in Assessing Software Architectures, University of Karlskrona Ronneby, Department of Software Engineering and Computer Science

Hadar Ziv, Debra J. Richardson, Bayesian-network Confirmation of Software Testing Uncertainties, Department of Information and Computer Science, University of California, Irvine

Good books

Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, First Edition
By: Judea Pearl


Well-known tools

GeNIe & SMILE Bayesian network tool

GeNIe is a program that supports inference with Bayesian networks and influence diagrams. It has been developed by the Decision Systems Laboratory, University of Pittsburgh, and is available for research purposes (http://www.sis.pitt.edu/~dsl).
GeNIe provides a graphical development environment for editing Bayesian networks and influence diagrams and for performing inference with them. The networks are solved using the junction tree algorithm (like Hugin). Interactions between variables may be defined using conditional probability tables. Models can be saved in various file formats: GeNIe has a proprietary format but also supports many others, such as the Netica format and the proposed standard format.
Additionally, a programmer's library with a C application programming interface (API), called SMILE, is available for integrating the system into other programs. The library may be used for research purposes. The source code is not available.
The graphical user interface is supported on MS Windows and Linux platforms. The API is supported on MS Windows and Linux.
In testing, the tool appeared very stable and easy to use. It serves a useful purpose by making it easy to experiment with Bayesian network formalisms.


J Cheng's Bayesian Belief Network Software

BN PowerConstructor: An efficient system that learns Bayesian belief network structures & parameters from data.
BN PowerPredictor: A data mining system for data modeling/classification/prediction. It extends BN PowerConstructor to BN based classifier learning.
Data PreProcessor: A tool used with BN PowerConstructor and BN PowerPredictor for pre-processing the training data.


HUGIN Bayesian network tool (Commercial)

The HUGIN system is a tool for constructing Bayesian-network-based inference modules for decision support systems. These modules are able to represent uncertainty in the status of the variables and in the probabilistic dependencies between the variables. Influence diagram representations are also supported.
The HUGIN system provides both an application programming interface (HUGIN API) and a graphical environment and development facilities for interactively defining Bayesian network structures and associated probability matrices.


Netica Bayesian network tool (Commercial)

The program provides a graphical development environment for editing Bayesian networks or influence diagrams and for performing inference with them. The networks are solved using the junction tree algorithm (like Hugin). Interactions between variables may be defined using conditional probability tables or using equations. The probabilities may also be learned from training cases. The system supports delayed links between variables; such models are automatically transformed into static models.
It is possible to reverse individual links of the network (the tool updates the probabilistic dependencies automatically) and also to remove nodes (the system updates the probabilities of the other nodes as appropriate).
Additionally, a programmer's library with a C application programming interface (API) is available for integrating the system into other programs.

Wonderful web resources


Mining Software Engineering Data: A Survey

Software organizations have often collected volumes of data in the hope of better understanding their processes and products. Useful information has been extracted from those large volumes of data, but it is commonly believed that large amounts of useful information remain hidden in software engineering databases.
Data mining has appeared as one of the tools of choice to better explore software engineering data. Data mining can be defined as the process of extracting new, non-trivial, and useful information from databases. This broad definition covers a wide spectrum of methods, techniques, and tools. This State of the Art Report (SOAR) discusses how data mining can be, and how it has been, used to analyze software engineering data.


The PROMISE Software Engineering Repository contains a collection of publicly available datasets and tools to serve researchers in building predictive software models (PSMs) and the software engineering community at large. The repository was created to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering.



Software estimation, benchmarking, productivity, risk analysis, and cost information for software developers and business


Software cost estimation
Software cost estimation is the process of predicting the amount of effort required to build a software system. Models provide one or more mathematical algorithms that compute cost as a function of a number of variables. Size is a primary cost factor in most models and can be measured using lines of code or function points. Models used to estimate cost can be categorized as either cost models or constraint models. COCOMO is an example of a cost model and SLIM is an example of a constraint model. Although criteria for evaluating a model have been suggested, there are some fundamental problems with existing models. Many models are available as automated tools.
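
As a rough illustration of "cost as a function of a number of variables", the sketch below evaluates the Basic COCOMO effort equation, effort = a × KLOC^b, where size is measured in thousands of lines of code and effort in person-months. The coefficients are the commonly published Basic COCOMO constants; the function name and the dictionary layout are hypothetical and chosen only for this example.

```python
# Basic COCOMO (Boehm, 1981): effort = a * KLOC ** b, in person-months.
# Coefficient pairs (a, b) for the three project modes.
COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo_effort(kloc, mode="organic"):
    """Estimate effort in person-months from size in thousands of lines of code."""
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    a, b = COEFFICIENTS[mode]
    return a * kloc ** b


# Same size, different project modes: the estimate grows with the mode's
# coefficients, so size is the primary but not the only cost factor.
for mode in COEFFICIENTS:
    print(mode, round(basic_cocomo_effort(32.0, mode), 1))
```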

Bayesian Elicitation of Experts Probabilities

The Probability Elicitation Tool

The Third Bayesian Modeling Applications Workshop, held during UAI-05 (Uncertainty in Artificial Intelligence 2005), Edinburgh, Scotland, UK

Leading researchers working on BBN-related projects

Created by: beyondtest
Last modified: 2006-03-28 09:55 AM