BAYESIAN INFERENCE AND MAXIMUM ENTROPY METHODS IN SCIENCE AND ENGINEERING: 22nd International Workshop

Frequency Estimation, Multiple Stationary Nonsinusoidal Resonances With Trend
In this paper, we address the problem of frequency estimation when multiple stationary nonsinusoidal resonances oscillate about a trend in nonuniformly sampled data when the number and shape of the resonances are unknown. To solve this problem we postulate a model that relates the resonances to the data and then apply Bayesian probability theory to derive the posterior probability for the number of resonances. The calculation is implemented using simulated annealing in a Markov chain Monte Carlo simulation to draw samples from this posterior distribution. From these samples, using Monte Carlo integration, we compute the posterior probability for the resonance frequencies given the model indicators as well as a number of other posterior distributions of interest. For a single sinusoidal resonance, the Bayesian sufficient statistic is numerically equal to the Lomb-Scargle periodogram. For a nonsinusoidal resonance this statistic is a straightforward generalization of both the discrete Fourier transform and the Lomb-Scargle periodogram. Finally, we illustrate the calculations using data taken from two different astrophysical sources.
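As a point of reference for the sufficient statistic discussed above, the following is a minimal pure-Python sketch of the classical Lomb-Scargle periodogram for nonuniformly sampled data (single-sinusoid case only; the function name and test signal are illustrative, not from the paper):

```python
import math, random

def lomb_scargle(t, y, freqs):
    """Classical Lomb-Scargle periodogram for nonuniformly sampled data."""
    ybar = sum(y) / len(y)
    y = [v - ybar for v in y]            # work with mean-subtracted data
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # time offset tau makes the statistic invariant to time shifts
        s2 = sum(math.sin(2 * w * ti) for ti in t)
        c2 = sum(math.cos(2 * w * ti) for ti in t)
        tau = math.atan2(s2, c2) / (2 * w)
        cc = ss = yc = ys = 0.0
        for ti, yi in zip(t, y):
            c = math.cos(w * (ti - tau))
            s = math.sin(w * (ti - tau))
            yc += yi * c; ys += yi * s
            cc += c * c;  ss += s * s
        power.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return power
```

On a clean sinusoid sampled at random times, the periodogram peaks at the true frequency, mirroring the role the statistic plays in the posterior described above.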

A Bayesian Approach to Estimating Coupling Between Neural Components: Evaluation of the Multiple Component, Event‐Related Potential (mcERP) Algorithm
Accurate measurement of single-trial responses is key to a definitive use of complex electromagnetic and hemodynamic measurements in the investigation of brain dynamics. We developed the multiple component, Event-Related Potential (mcERP) approach to single-trial response estimation to improve our resolution of dynamic interactions between neuronal ensembles located in different layers within a cortical region and/or in different cortical regions. The mcERP model asserts that multiple components defined as stereotypic waveforms comprise the stimulus-evoked response and that these components may vary in amplitude and latency from trial to trial. Maximum a posteriori (MAP) solutions for the model are obtained by iterating a set of equations derived from the posterior probability. Our first goal was to use the mcERP algorithm to analyze interactions (specifically latency and amplitude correlation) between responses in different layers within a cortical region. Thus, we evaluated the model by applying the algorithm to synthetic data containing two correlated local components and one independent far-field component. Three cases were considered: the local components were correlated by an interaction in their single-trial amplitudes, by an interaction in their single-trial latencies, or by an interaction in both amplitude and latency. We then analyzed the accuracy with which the algorithm estimated the component waveshapes and the single-trial parameters as a function of these relationships. Extensions of these analyses to real data are discussed as well as ongoing work to incorporate more detailed prior information.

Bayesian Estimation of Fish Disease Prevalence from Pooled Samples Incorporating Sensitivity and Specificity
An important emerging issue in fisheries biology is the health of free-ranging populations of fish, particularly with respect to the prevalence of certain pathogens. For many years, pathologists focused on captive populations and interest was in the presence or absence of certain pathogens, so it was economically attractive to test pooled samples of fish. Recently, investigators have begun to study individual fish prevalence from pooled samples. Estimation of disease prevalence from pooled samples is straightforward when assay sensitivity and specificity are perfect, but this assumption is unrealistic. Here we illustrate the use of a Bayesian approach for estimating disease prevalence from pooled samples when sensitivity and specificity are not perfect. We also focus on diagnostic plots to monitor the convergence of the Gibbs-sampling-based Bayesian analysis. The methods are illustrated with a sample data set.
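The pooled-testing likelihood at the heart of this approach can be illustrated with a simple grid-based posterior rather than the Gibbs sampler used in the paper; all parameter values below are made up for illustration:

```python
import math

def posterior_prevalence(x, n, k, se, sp, grid=2000):
    """Grid posterior mean for prevalence p, given x positive pools out of
    n pools of size k, test sensitivity se and specificity sp, and a
    uniform prior on p.  A pool tests positive with probability
    pi = se*(1-(1-p)^k) + (1-sp)*(1-p)^k."""
    ps, logws = [], []
    for i in range(1, grid):
        p = i / grid
        q = (1.0 - p) ** k                      # pool is truly all-negative
        pi = se * (1.0 - q) + (1.0 - sp) * q    # prob. the pool tests positive
        logws.append(x * math.log(pi) + (n - x) * math.log(1.0 - pi))
        ps.append(p)
    m = max(logws)
    ws = [math.exp(w - m) for w in logws]       # normalize in log space
    z = sum(ws)
    return sum(p * w for p, w in zip(ps, ws)) / z
```

With 78 positive pools out of 200 pools of 10 fish, sensitivity 0.9 and specificity 0.95, the posterior mean recovers a prevalence near 5 percent.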

Using Bayesian Analysis and Maximum Entropy To Develop Non‐parametric Probability Distributions for the Mean and Variance
Estimation of the population mean and variance is generally carried out using sample estimates. Given normality of the parent population, the distribution of the sample mean and sample variance is straightforward. However, when normality cannot be assumed, inference is usually based on approximations through the use of the Central Limit Theorem. In addition, the data generated from many real populations may be naturally bounded, e.g., weights and heights. Thus, the unbounded normal probability model may not be appropriate. Utilizing Bayesian analysis and maximum entropy, procedures are developed which produce nonparametric distributions for both the mean and the mean/standard-deviation combination. These methods require no assumptions on the form of the parent distribution or the size of the sample and inherently make use of existing bounds.
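One ingredient in constructions of this kind is the maximum entropy density on a bounded interval with a prescribed mean, which has the exponential form p(x) ∝ exp(λx). A minimal sketch, assuming the interval [0, 1] and solving the mean constraint by bisection (not the paper's full procedure):

```python
import math

def maxent_lambda_on_unit_interval(target_mean, tol=1e-10):
    """Solve for lam so that p(x) = exp(lam*x)/Z on [0,1] has the given mean.
    The mean under p is exp(lam)/(exp(lam)-1) - 1/lam, increasing in lam."""
    def mean(lam):
        if abs(lam) < 1e-8:
            return 0.5 + lam / 12.0             # series expansion near lam = 0
        return math.exp(lam) / (math.exp(lam) - 1.0) - 1.0 / lam
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A target mean above 0.5 yields a positive λ (density tilted toward 1), and a numerical integral of x under the resulting density reproduces the target.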

Chernoff’s bound forms
Chernoff’s bound bounds a tail probability (i.e., Pr(X ⩾ a), where a ⩾ EX). Assuming that the distribution of X is Q, the logarithm of the bound is known to be equal to the value of relative entropy (or minus the Kullback-Leibler distance) for the I-projection P̂ of Q on the set H ≜ {P : E_P X = a}. Here, Chernoff’s bound is related to maximum likelihood in exponential form, and consequently implications for the notion of complementarity are discussed. Moreover, a novel form of the bound is proposed, which expresses the value of Chernoff’s bound directly in terms of the I-projection (or generalized I-projection).
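For a standard normal X the bound takes a closed form, which makes the I-projection interpretation easy to check numerically: minimizing exp(-θa)M(θ) with M(θ) = exp(θ²/2) gives exp(-a²/2), whose logarithm is minus KL(N(a,1) ‖ N(0,1)). A sketch (the grid search is illustrative; the minimizer is θ = a):

```python
import math

def chernoff_bound_normal(a, grid=10000):
    """Chernoff bound for P(X >= a), X ~ N(0,1): min over theta > 0 of
    exp(-theta*a) * M(theta), with mgf M(theta) = exp(theta**2 / 2).
    The minimized log-value, -a**2/2, equals -KL(N(a,1) || N(0,1))."""
    best = float('inf')
    for i in range(1, grid + 1):
        theta = 5.0 * a * i / grid              # search theta in (0, 5a]
        best = min(best, math.exp(-theta * a + 0.5 * theta * theta))
    return best
```

For a = 2 the bound is e⁻² ≈ 0.135, safely above the true tail probability 0.5·erfc(2/√2) ≈ 0.023.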

Maximum Entropy Approach to a Mean Field Theory for Fluids
Making statistical predictions requires tackling two problems: one must assign appropriate probability distributions, and then one must calculate a variety of expected values. The method of maximum entropy is commonly used to address the first problem. Here we explore its use to tackle the second problem. We show how this use of maximum entropy leads to the Bogoliubov variational principle, which we generalize, apply to density functional theory, and use to develop a mean field theory for classical fluids. Numerical calculations for argon gas are compared with experimental data.
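The variational principle in question (often stated as the Gibbs-Bogoliubov inequality, F ≤ F₀ + ⟨H − H₀⟩₀) can be verified on a two-level toy system rather than a classical fluid; the energy values below are made up for illustration:

```python
import math

def variational_free_energy(eps=1.3, beta=1.0):
    """Gibbs-Bogoliubov bound F <= F0 + <H - H0>_0 for a two-level system
    H = diag(0, eps), using a trial Hamiltonian H0 = diag(0, lam).
    Returns (min over lam of the bound, minimizing lam, exact F)."""
    def f_var(lam):
        z0 = 1.0 + math.exp(-beta * lam)
        f0 = -math.log(z0) / beta               # trial free energy
        p1 = math.exp(-beta * lam) / z0         # trial occupation of level 1
        return f0 + p1 * (eps - lam)            # F0 + <H - H0>_0
    lams = [0.01 * k for k in range(1, 400)]
    vals = [f_var(l) for l in lams]
    best = min(vals)
    f_exact = -math.log(1.0 + math.exp(-beta * eps)) / beta
    return best, lams[vals.index(best)], f_exact
```

The bound is tight exactly when the trial Hamiltonian matches the true one (λ = ε), which the minimization recovers.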

Hierarchies of Models: Toward Understanding Planetary Nebulae
Stars like our sun (initial masses between 0.8 and 8 solar masses) end their lives as swollen red giants surrounded by cool extended atmospheres. The nuclear reactions in their cores create carbon, nitrogen and oxygen, which are transported by convection to the outer envelope of the stellar atmosphere. As the star finally collapses to become a white dwarf, this envelope is expelled from the star to form a planetary nebula (PN) rich in organic molecules. The physics, dynamics, and chemistry of these nebulae are poorly understood and have implications not only for our understanding of the stellar life cycle but also for organic astrochemistry and the creation of prebiotic molecules in interstellar space. We are working toward generating three-dimensional models of planetary nebulae (PNe), which include the size, orientation, shape, expansion rate and mass distribution of the nebula. Such a reconstruction of a PN is a challenging problem for several reasons. First, the data consist of images obtained over time from the Hubble Space Telescope (HST) and spectra obtained from Kitt Peak National Observatory (KPNO) and Cerro Tololo Inter-American Observatory (CTIO). These images are of course taken from a single viewpoint in space, which amounts to a very challenging tomographic reconstruction. Second, the fact that we have two disparate and orthogonal data types requires that we utilize a method that allows these data to be used together to obtain a solution. To address these first two challenges we employ Bayesian model estimation using a parameterized physical model that incorporates much prior information about the known physics of the PN. In our previous work we found that the forward problem of the comprehensive model is extremely time-consuming. To address this challenge, we explore the use of a set of hierarchical models, which allow us to estimate increasingly detailed sets of model parameters.
These hierarchical models of increasing complexity are akin to scientific theories of increasing sophistication, with each new model/theory being a refinement of a previous one by either incorporating additional prior information or by introducing a new set of parameters to model an entirely new phenomenon. We apply these models to both a simulated and a real ellipsoidal PN to initially estimate the position, angular size, and orientation of the nebula as a two‐dimensional object and use these estimates to later examine its three‐dimensional properties. The efficiency/accuracy tradeoffs of the techniques are studied to determine the advantages and disadvantages of employing a set of hierarchical models over a single comprehensive model.

Dirichlet Integral Principle For Elliptic Type Quasilinear PDEs of Irreversible Heat Conduction Process With Minimum Principles For First, Second And Third Type Boundary Conditions
Onsager’s [1,2] and Prigogine’s [3,4] type minimum principles can be treated for irreversible processes in the frame of classical irreversible thermodynamics (CIT). The results agree with Gyarmati’s [5] integral principle. It is especially worthwhile to investigate the irreversible heat conduction process in the case of a stationary state, for which new quasilinear elliptic-type PDEs are derived from the principles of minimal energy dissipation and minimum entropy production. Evaluating these PDEs with the aid of the Dirichlet Integral Principle yields the first-, second- and third-type boundary condition solutions for each minimum principle. Here the interpretation of the Dirichlet Integral Principle differs essentially from the usual “conservative” approach using Laplace’s equation in conjunction with potential theory. Dissipation potentials of Rayleigh and Onsager type also agree with the stated results. The evolution of the process towards a stationary state can be explained with the Glansdorff-Prigogine criterion. Boundary conditions of the fourth kind define the process of conduction between a single body, or a system of bodies, and their surroundings. The bodies are assumed to be in perfect contact, where the surfaces in contact have the same temperature.
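The connection between the Dirichlet integral and stationary conduction can be seen in one dimension: minimizing ∫(u′)² subject to first-type (fixed-temperature) boundaries is equivalent to solving u″ = 0, whose minimizer is the linear temperature profile. A toy relaxation sketch (1-D only, far simpler than the quasilinear problems treated in the paper):

```python
def steady_temperature(t_left=0.0, t_right=100.0, n=21, iters=20000):
    """Jacobi relaxation of u'' = 0 on a rod with first-type (Dirichlet)
    boundary conditions; each sweep replaces interior values with the
    average of their neighbors, driving the Dirichlet integral down."""
    u = [0.0] * n
    u[0], u[-1] = t_left, t_right
    for _ in range(iters):
        u = [u[0]] + [0.5 * (u[i - 1] + u[i + 1]) for i in range(1, n - 1)] + [u[-1]]
    return u
```

The converged solution is the straight line between the two boundary temperatures, the unique minimizer of the Dirichlet integral for these boundary conditions.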

Bayesian analysis of magnetic island dynamics
We examine a first-order differential equation in time used to describe magnetic islands in magnetically confined plasmas. The free parameters of this equation are obtained by employing Bayesian probability theory. Additionally, a typical Bayesian change-point problem is solved in the process of obtaining the data.

Learning in presence of input noise using the stochastic EM algorithm
Most learning algorithms rely on the assumption that the input training data contain no noise or uncertainty. However, when collecting data in an identification experiment it may not be possible to avoid noise when measuring the input. The errors-in-variables model is then more appropriate for describing the data. However, learning based on maximum likelihood estimation is far from straightforward because of the large number of unknown parameters. In this paper, to overcome the problems associated with estimating a large number of unknown parameters, the nonlinear errors-in-variables estimation problem is treated in a Bayesian formulation. To compute the necessary maximum a posteriori estimate we use restoration-maximization algorithms, where the true but unknown training inputs are treated as hidden variables. To accelerate the convergence of the algorithm, a modified version of the stochastic EM algorithm is proposed. A simulation example on learning a nonlinear parametric function and an example on learning feedforward neural networks are presented to illustrate the effectiveness of the proposed learning method.
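The flavor of stochastic EM with hidden true inputs can be conveyed on the simplest errors-in-variables model: a linear map y = b·x with Gaussian noise on both x (observed as u) and y, all noise scales known. This is a sketch under those assumptions, not the paper's nonlinear algorithm:

```python
import math, random

def stochastic_em_eiv(u, y, su, sy, sx, iters=60, burn=30, seed=0):
    """Stochastic EM for y = b*x + e_y, u = x + e_u, x ~ N(0, sx^2).
    E-step: draw the hidden true inputs x from their Gaussian posterior;
    M-step: re-estimate the slope b from the completed data."""
    rng = random.Random(seed)
    b = sum(ui * yi for ui, yi in zip(u, y)) / sum(ui * ui for ui in u)  # naive start
    trace = []
    for it in range(iters):
        prec = 1 / sx**2 + 1 / su**2 + b**2 / sy**2   # posterior precision of x_i
        sd = 1.0 / math.sqrt(prec)
        x = [rng.gauss((ui / su**2 + b * yi / sy**2) / prec, sd)
             for ui, yi in zip(u, y)]
        b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
        if it >= burn:
            trace.append(b)
    return sum(trace) / len(trace)                    # average post-burn-in draws
```

A naive regression of y on the noisy u is attenuated toward zero; treating the true inputs as hidden variables removes that bias, which is the point of the errors-in-variables formulation.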

A Bayesian Classification Model for Real‐Time Intrusion Detection
Intrusion-detection systems (IDS) have been used as part of the security of information and communication technology infrastructure because it is difficult to ensure that information systems are free from security flaws. In this paper we present a new design of an anomaly IDS. Design and development of the IDS are considered in three main stages: normal behavior construction, anomaly detection and model update. A parametric mixture model is used for behavior modeling from reference data. The associated Bayesian classification leads to the detection algorithm. Continuous re-estimation of the model parameters is discussed as a possible heuristic for model update. Real-time requirements are presented. Detection and update algorithms for the special case of a Gaussian parametric model are designed and evaluated with respect to their real-time features on a PC-like platform without any special hardware requirements. Experiments validating the model are presented as well.

A Bayesian Analysis of Recent Developments in Pattern Classification
The goal of Bayesian pattern classification is to state the probability that an object belongs to a particular class given observed values of attributes of the object. Recent research in this area has centered on the use of either mixture models or kernel-based methods, such as Support Vector Machines and Relevance Vector Machines. We review this research and show how these areas are related to each other.

Statistical problems with weather‐radar images, I: Clutter identification
A Markov Chain Monte Carlo (MCMC) procedure is presented for the identification of clutter in weather-radar images. The key attributes of the image are the spatial coherence of the areas of clutter (noise) and cloud and the high spatial autocorrelation of the values in areas of cloud. A form of simulated annealing provides the possibility of fast clutter removal.
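The combination of a data-fit term with a spatial-coherence prior, optimized by simulated annealing, can be sketched in one dimension (real clutter fields are two-dimensional images; the energy and schedule below are illustrative assumptions):

```python
import math, random

def anneal_labels(z, beta=1.5, t0=2.0, sweeps=200, seed=0):
    """Simulated-annealing sketch: label each pixel clutter(1)/cloud(0) from
    noisy scores z, with an Ising-style term rewarding spatially coherent
    labels, mirroring the spatial-coherence attribute described above."""
    rng = random.Random(seed)
    n = len(z)
    lab = [1 if zi > 0.5 else 0 for zi in z]        # thresholded start
    def local_energy(i, s):
        e = (z[i] - s) ** 2                          # data-fit term
        for j in (i - 1, i + 1):                     # neighbor agreement term
            if 0 <= j < n:
                e += beta * (s != lab[j])
        return e
    for sweep in range(sweeps):
        t = t0 / (1 + sweep)                         # cooling schedule
        for i in range(n):
            s = 1 - lab[i]
            dE = local_energy(i, s) - local_energy(i, lab[i])
            if dE < 0 or rng.random() < math.exp(-dE / t):
                lab[i] = s
    return lab
```

Isolated mislabeled pixels are repaired by the neighbor-agreement term, which is the mechanism that makes annealing effective for spatially coherent clutter.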

Statistical problems with weather‐radar images, II: Attenuation detection
A procedure based on the combination of a Bayesian changepoint model and ordinary least squares is used to identify and quantify regions where a radar signal has been attenuated (i.e., diminished) as a consequence of intervening weather. A graphical polar display is introduced that illustrates the location and importance of the attenuation.

Wavelet Domain Image Separation
In this paper, we consider the problem of blind signal and image separation using a sparse representation of the images in the wavelet domain. We consider the problem in a Bayesian estimation framework, using the fact that the distribution of the wavelet coefficients of real-world images can naturally be modeled by an exponential power probability density function. The Bayesian approach, which has been used with success in blind source separation, also gives the possibility of including any prior information we may have on the mixing matrix elements as well as on the hyperparameters (parameters of the prior laws of the noise and the sources). We consider two cases: first, the case where the wavelet coefficients are assumed to be i.i.d., and second, the case where we model the correlation between the coefficients of two adjacent scales by a first-order Markov chain. This paper reports only on the first case; results for the second case will be reported in the near future. The estimation computations are done via a Markov chain Monte Carlo (MCMC) procedure. Some simulations show the performance of the proposed method.
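The sparsity that motivates working in the wavelet domain is easy to demonstrate with the simplest wavelet, the Haar basis (a one-level transform; the paper's choice of wavelet and depth is not specified here):

```python
import math

def haar_dwt(x):
    """One level of the orthonormal Haar transform: pairwise averages
    (approximation) and differences (detail).  For smooth real-world
    signals most detail coefficients are small, i.e. the representation
    is sparse."""
    r = math.sqrt(2.0)
    approx = [(x[2 * i] + x[2 * i + 1]) / r for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) / r for i in range(len(x) // 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Exact inverse of the one-level Haar transform."""
    r = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x += [(a + d) / r, (a - d) / r]
    return x
```

Because the transform is orthonormal and invertible, separation can be carried out on the sparse coefficients and mapped back to the image domain without loss.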

What is a Question?
A given question can be defined in terms of the set of statements or assertions that answer it. Application of logical inference to these sets of assertions allows one to derive the logic of inquiry among questions. There are interesting symmetries between the logics of inference and inquiry; where probability describes the degree to which a premise implies an assertion, there exists an analogous measure that describes the bearing or relevance that a question has on an outstanding issue. These have been extended to suggest that the logic of inquiry results in functional relationships analogous to, although more general than, those found in information theory. Employing lattice theory, I examine in greater detail the structure of the space of assertions and questions, demonstrating that the symmetries between the logical relations in each of the spaces derive directly from the lattice structure. Furthermore, I show that while symmetries between the spaces exist, the two lattices are not isomorphic. The lattice of assertions is described by a Boolean lattice 2^N, whereas the lattice of assuredly real questions is shown to be a sublattice of the free distributive lattice FD(N) ⊆ 2^{2^N}. Thus there does not exist a one-to-one mapping of assertions to questions, there is no reflection symmetry between the two spaces, and questions in general do not possess complements. Lastly, with these lattice structures in mind, I discuss the relationship between probability, relevance, and entropy.

Logical and Geometric Inquiry
This paper proposes a framework for quantifying logical and geometric inquiry through specific interpretations of Bayes’ Theorem and Information Theory. In logical inquiry there is a countable number of possible discrete answers that define the inquiry, and Bayes’ Theorem serves to move the observer posing the question along a trajectory in a hyperbolic figure, in a manner suggested by Rodriguez. For N = 3, this figure is a hyperbolic triangle whose angles sum to zero, the smallest possible value in the hyperbolic plane, where the angles of a triangle sum to a nonnegative number less than pi. In Euclidean space, the hyperbolic figure becomes a multi-dimensional simplex or polyhedron described by Shannon in his paper on a geometrical perspective of channel capacity. A theory of geometric inquiry requires that one consider an observer who conjointly possesses an objective reality space Θ and a physical or measurable space X. It is discussed how the matching of these spaces characterizes the ability of an observer to distinguish its posited objective reality. A simple functional form I is suggested as a measure of the degree of distinguishability for an observer. This form corresponds to the trace of the Fisher information matrix of p(x|θ) over θ ∈ Θ. The origin and precise specification of the requirements that give rise to the specified functional form are unknown and represent an important area of future study, with clues suggested in the work of Balasubramanian. At the same time, the question is asked regarding the nature of the metrics and probability distributions that arise when an observer balances prior ignorance and prior knowledge through the extremizing of a functional J(p, ∇p) = I + λH over probability densities p. The functional I is the a priori ability of the observer to distinguish pure space, H is the prior ignorance of the same observer over the same space, and λ is a scalar Lagrange multiplier ostensibly needed to balance units, but having additional interesting properties. Explicit solutions are derived for the optimal p, in one dimension and in general in N dimensions, for λ = 0 and λ ≠ 0. In particular, the distributions that result when λ ≠ 0 include Gaussian densities satisfying the functional form of distributions defining the elements of the Fisher information matrix of pure space, as discussed by Rodriguez, which possesses negative curvature when spatial uncertainty exists. Although only inquiry is discussed, a formalized conjoint theory of inquiry and control has significant implications for the engineering and design of intelligent systems that operate cybernetically.

Yet Another Analysis of Dice Problems
During the MaxEnt 2002 workshop in Moscow, Idaho, Tony Vignaux asked again a few simple questions about using Maximum Entropy or Bayesian approaches for the famous dice problems, which have been analyzed many times in this workshop series and elsewhere. Here is another analysis of these problems. I hope that this paper will answer a few of the questions of Tony and other participants of the workshop about the situations where we can use Maximum Entropy or Bayesian approaches, or even the cases where we can actually use both of them.

Information geometry and prior selection
In this contribution, we study the problem of prior selection arising in Bayesian inference. There is an extensive literature on the construction of noninformative priors, and the subject seems far from a definite solution [1]. Here we revisit this subject with differential-geometry tools and propose to construct the prior in a Bayesian decision-theoretic framework. We show how the construction of a prior by projection is the best way to take into account the restriction to a particular family of parametric models. For instance, we apply this procedure to curved parametric families, where the ignorance is directly expressed by the relative geometry of the restricted model in the wider model containing it.
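A classical geometry-motivated construction from this literature (not the projection method proposed in the paper) is Jeffreys' prior, proportional to the square root of the Fisher information; for the Bernoulli model it is the Beta(1/2, 1/2) density:

```python
import math

def jeffreys_prior_bernoulli(p):
    """Jeffreys' prior, proportional to sqrt(Fisher information).  For a
    Bernoulli model I(p) = 1/(p(1-p)), and the normalized density is the
    Beta(1/2, 1/2) law, 1/(pi * sqrt(p(1-p)))."""
    return 1.0 / (math.pi * math.sqrt(p * (1.0 - p)))
```

The density integrates to one and is symmetric with minimum 2/π at p = 1/2, placing extra mass near the boundary where the model is most informative per observation.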

An Entropy Approach for Utility Assignment in Decision Analysis
A fundamental step in decision analysis is the elicitation of the decision-maker’s preferences about the prospects of a decision situation in the form of utility values. However, this can be a difficult task to perform in practice, as the number of prospects may be large, and eliciting a utility value for each prospect may be a time-consuming and stressful task for the decision maker. To relieve some of the burden of this task, this paper presents a normative method to assign unbiased utility values when only incomplete preference information is available about the decision maker. We introduce the notion of a utility density function and propose a maximum entropy utility principle for utility assignment.
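One way to see the flavor of such a principle (a simplified sketch, not necessarily the paper's formulation): treat the increments between the utilities of rank-ordered prospects as a probability distribution. With only the ordering known, the maximum entropy increments are uniform, which yields evenly spaced utility values:

```python
import math

def utility_from_increments(incs):
    """Build utilities on [0,1] from nonnegative increments summing to 1,
    anchoring the worst prospect at 0 and the best at 1."""
    u, total = [0.0], 0.0
    for d in incs:
        total += d
        u.append(total)
    return u

def entropy(incs):
    """Shannon entropy of an increment vector viewed as a distribution."""
    return -sum(d * math.log(d) for d in incs if d > 0)
```

Uniform increments have strictly higher entropy than any skewed alternative, so with no information beyond the ranking, the principle assigns evenly spaced utilities.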