Data Mining Bayesian Classification

Samundeeswari


In many real-world situations, the connection between a set of attributes and the class label (the outcome we want to predict) is uncertain. This means that even if a test case has the same attributes as some training examples, we cannot be entirely sure about its class label. Such uncertainty can arise due to noisy data or other influencing factors not accounted for in the analysis.


For example, consider predicting whether a person is at risk of liver disease based on their eating habits and exercise routine. While generally, healthy eating and regular exercise reduce the likelihood of liver disease, a person might still develop it due to other factors like consuming street food or alcohol. Additionally, determining whether someone's diet is truly healthy or if their exercise is effective can also be subjective, leading to potential errors in the prediction process.


Bayesian classification, based on Bayes' theorem, helps make predictions by applying probability. Bayesian classifiers are statistical models that use probability to estimate the likelihood of different outcomes. Bayes' theorem, introduced by Thomas Bayes, uses conditional probability to provide a way of updating our beliefs based on new evidence, allowing us to calculate probabilities for unknown outcomes.
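To make this concrete, here is a minimal sketch of a naive Bayesian classifier for the liver-disease example above. The tiny training set, attribute names, and values are all invented for illustration; it scores each class by its prior probability times the (smoothed) conditional probability of each observed attribute value.

```python
from collections import Counter, defaultdict

# Invented training data: (attributes, class label)
train = [
    ({"diet": "healthy", "exercise": "regular"}, "low"),
    ({"diet": "healthy", "exercise": "none"},    "low"),
    ({"diet": "junk",    "exercise": "none"},    "high"),
    ({"diet": "junk",    "exercise": "regular"}, "high"),
    ({"diet": "junk",    "exercise": "none"},    "high"),
]

class_counts = Counter(label for _, label in train)
attr_counts = defaultdict(Counter)   # (attribute, class) -> value counts
for features, label in train:
    for attr, value in features.items():
        attr_counts[(attr, label)][value] += 1

def predict(features):
    """Pick the class maximizing P(class) * product of P(value | class)."""
    scores = {}
    for label, n in class_counts.items():
        score = n / len(train)  # prior P(class)
        for attr, value in features.items():
            # Laplace smoothing avoids zero probabilities for unseen values
            score *= (attr_counts[(attr, label)][value] + 1) / (n + 2)
        scores[label] = score
    return max(scores, key=scores.get)

print(predict({"diet": "junk", "exercise": "none"}))  # -> "high"
```

Even with uncertain, noisy attributes, the classifier outputs the most probable class rather than a hard rule.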


Bayes' Theorem is a mathematical formula used to calculate the probability of an event based on prior knowledge of conditions related to the event. It is expressed as:


P(A|B) = [P(B|A) × P(A)] / P(B)

Where:

P(A|B) is the posterior probability, or the probability of event A occurring given that B is true.

P(B|A) is the likelihood, or the probability of event B occurring given that A is true.

P(A) is the prior probability of event A, or how likely A is without any other information.

P(B) is the marginal probability of event B, or how likely B is regardless of A.


This theorem provides a way to update the probability of a hypothesis (A) in light of new evidence (B).
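The update can be carried out directly from the formula. The sketch below uses invented numbers for the rain example: a 20% prior chance of rain, a 90% chance of dark clouds when it rains, and a 30% overall chance of dark clouds.

```python
def posterior(prior_a, likelihood_b_given_a, marginal_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / marginal_b

p_rain = 0.20               # P(A): prior probability of rain
p_clouds_given_rain = 0.90  # P(B|A): dark clouds, given that it rains
p_clouds = 0.30             # P(B): marginal probability of dark clouds

print(posterior(p_rain, p_clouds_given_rain, p_clouds))  # -> 0.6
```

Observing dark clouds raises the belief in rain from 20% to 60%.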

Bayesian Interpretation:

The Bayesian interpretation is a way of understanding probability that treats it as a measure of belief or certainty about an event, rather than just a frequency of occurrence. In this view, probability is subjective and can change as new information becomes available. Here’s a breakdown of key aspects of the Bayesian interpretation:


1. Prior Probability (P(A)):

   - This represents the initial belief about the likelihood of an event (A) before considering any new evidence.

   - Example: Suppose you want to predict if it will rain today, based on historical data. If historically it rains 20% of the time, your prior belief (prior probability) is that there’s a 20% chance of rain.


2. Likelihood (P(B|A)):

   - This is the probability of observing the evidence (B) if the event (A) is true.

   - Example: You might estimate the probability of seeing dark clouds (B) given that it rains (A) — say, dark clouds appear on most rainy days.


3. Posterior Probability (P(A|B)):

   - After observing new evidence (B), Bayes' theorem is used to update the prior probability into the posterior probability, reflecting your updated belief about event A after considering the evidence.

   - Example: If dark clouds are present (B), you update your belief about the likelihood of rain (A), using the likelihood of seeing dark clouds when it rains.


4. Bayesian Updating:

   - As new data or evidence becomes available, the Bayesian approach allows you to continuously update your beliefs (probabilities). This iterative process adjusts the likelihood of an event based on the evidence observed.

   - Example: You might initially believe it will rain (prior probability). If you see dark clouds (evidence), you increase the probability that it will rain (posterior probability). If you also hear thunder later, you further increase that probability.
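The iterative process above can be sketched as repeated application of Bayes' theorem, where each posterior becomes the prior for the next piece of evidence. All the probabilities here are invented for illustration; the marginal P(B) is computed by the law of total probability.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """One step of Bayesian updating: return P(H | evidence)."""
    marginal = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / marginal

belief = 0.20                          # prior P(rain)
belief = update(belief, 0.90, 0.25)    # observe dark clouds
belief = update(belief, 0.60, 0.05)    # later, hear thunder
print(round(belief, 3))
```

Each observation that is more likely under "rain" than under "no rain" pushes the belief upward; the final posterior is noticeably higher than the 20% prior.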


Summary of Bayesian Interpretation:

- Probabilities reflect degrees of belief rather than fixed frequencies.

- Prior beliefs are updated as new evidence is encountered, leading to a more informed belief, called the posterior probability.

- This approach is flexible and applicable to decision-making in uncertain conditions, where new evidence is continuously gathered.


In contrast to frequentist interpretation (which views probability as the long-run frequency of an event), Bayesian probability is more about belief or uncertainty in light of both prior knowledge and new evidence.

Bayesian Network:

A Bayesian Network is a type of Probabilistic Graphical Model (PGM) used to represent and calculate uncertainty using probability. Often called Belief Networks, these networks use a Directed Acyclic Graph (DAG) to display relationships between variables.


In a DAG, nodes represent variables, and the directed links (arrows) show connections between them, indicating how one variable affects another. Unlike other types of graphs, a DAG has no cycles, meaning you cannot trace a directed path that circles back to the same node.




A Directed Acyclic Graph (DAG) helps model the uncertainty of events by showing how each event (or variable) depends on others. To show these dependencies, we use Conditional Probability Distributions (CPDs), which tell us how likely one event is given the occurrence of another. A Conditional Probability Table (CPT) is a simple way to organize these probabilities for each event in the graph.
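A tiny Bayesian network can illustrate how CPTs combine. In the sketch below, the chain Cloudy → Rain → WetGrass and all of its probabilities are assumptions for illustration; the joint probability factorizes along the DAG as P(C, R, W) = P(C) × P(R|C) × P(W|R).

```python
# CPTs stored as plain dicts (illustrative numbers)
p_cloudy = {True: 0.5, False: 0.5}          # P(Cloudy)
p_rain_given_cloudy = {True: 0.8, False: 0.2}   # P(Rain=True | Cloudy)
p_wet_given_rain = {True: 0.9, False: 0.1}      # P(Wet=True | Rain)

def joint(cloudy, rain, wet):
    """P(C, R, W) = P(C) * P(R|C) * P(W|R) via the chain rule on the DAG."""
    pc = p_cloudy[cloudy]
    pr = p_rain_given_cloudy[cloudy] if rain else 1 - p_rain_given_cloudy[cloudy]
    pw = p_wet_given_rain[rain] if wet else 1 - p_wet_given_rain[rain]
    return pc * pr * pw

# The joint distribution over all eight assignments sums to 1
total = sum(joint(c, r, w)
            for c in (True, False)
            for r in (True, False)
            for w in (True, False))
print(joint(True, True, True), total)  # -> 0.36 1.0
```

Because each node only conditions on its parents, the full joint distribution needs only the small CPTs rather than a table over every combination of all variables.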
