2009 INTERNATIONAL CONFERENCE ON COMPUTATIONAL MODELS FOR LIFE SCIENCES (CMLS‐09)
Multistage Spatial Property Based Segmentation for Quantification of Fluorescence Distribution in Cells
1210(2010); http://dx.doi.org/10.1063/1.3314268
The distribution of fluorescence in cells is often interpreted by simple visual inspection of microscope‐derived images for qualitative studies. In other cases, however, it is desirable to quantify the distribution of fluorescence using digital image processing techniques. This paper addresses the challenges that noise in the data poses for fluorescence segmentation. We report that intensity measurements alone cannot separate target from background where their intensity ranges overlap. Consequently, spatial properties derived from the neighborhood profile were included. Mathematical morphology operations were implemented for cell boundary extraction, and a window‐based contrast measure was developed for identifying fluorescence puncta. All of these operations were applied in the proposed multistage processing scheme. The test results show that the spatial measures effectively enhance target separability.
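The abstract does not define its window‐based contrast measure, so the following is only a minimal sketch of the general idea. It assumes contrast is the pixel intensity minus the mean of the surrounding square window; the names `window_contrast` and `puncta_mask`, the window size `half`, and the threshold are all hypothetical.

```python
def window_contrast(image, r, c, half=1):
    """Intensity of pixel (r, c) minus the mean of its window neighbours.

    Assumption of this sketch: the window is a (2*half+1) square clipped at
    the image border, and the centre pixel is excluded from the mean.
    The image must contain at least two pixels.
    """
    rows, cols = len(image), len(image[0])
    neighbours = [
        image[i][j]
        for i in range(max(0, r - half), min(rows, r + half + 1))
        for j in range(max(0, c - half), min(cols, c + half + 1))
        if (i, j) != (r, c)
    ]
    background = sum(neighbours) / len(neighbours)
    return image[r][c] - background


def puncta_mask(image, threshold, half=1):
    """Mark pixels whose local contrast exceeds a (hypothetical) threshold."""
    return [
        [window_contrast(image, r, c, half) > threshold
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]
```

A bright pixel on a uniform background gets a high contrast value, while its neighbours get negative values because the bright pixel raises their local mean. This is one way a spatial measure can separate puncta from background pixels of similar absolute intensity.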
1210(2010); http://dx.doi.org/10.1063/1.3314266
Subcellular localization is a key functional characteristic of proteins. An automatic, reliable and efficient prediction system for protein subcellular localization is needed for large‐scale genome analysis. The automated cell phenotype image classification problem is an interesting “bioimage informatics” application. It can be used to establish knowledge of the spatial distribution of proteins within living cells, and it enables screening systems for drug discovery or for early diagnosis of disease. In this paper, three well‐known texture feature extraction methods, local binary patterns (LBP), Gabor filtering and the Gray Level Co‐occurrence Matrix (GLCM), are applied to cell phenotype images, and a multilayer perceptron (MLP) is used to classify the images. After the extracted features are classified, the decision‐templates ensemble algorithm (DT) is used to combine the base classifiers built on the different feature sets. Different texture feature sets can provide sufficient diversity among base classifiers, which is known to be a necessary condition for improving ensemble performance. For HeLa cells, the human classification error rate on this task is 17%, as reported in previous publications. Our method obtains an error rate of 4.8%.
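To illustrate one of the texture features named above, here is a minimal sketch of the basic 3×3 local binary pattern. The abstract does not say which LBP variant, radius or neighbour ordering was used, so the clockwise ordering, the `>=` comparison and the function names are assumptions of this sketch.

```python
def lbp_code(image, r, c):
    """8-bit LBP code of the interior pixel (r, c) in a 2-D list of intensities.

    Each of the 8 neighbours contributes one bit: 1 if the neighbour is at
    least as bright as the centre pixel, otherwise 0.
    """
    center = image[r][c]
    # Neighbours visited clockwise, starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The histogram, not the individual codes, is what would typically be passed to a classifier as the texture feature vector.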
1210(2010); http://dx.doi.org/10.1063/1.3314267
The description, recognition and analysis of biological images play an important role in helping humans describe and understand the related biological information. The color images are first separated by color reduction. A new and efficient linearization algorithm is introduced, based on criteria derived from the difference chain code. A series of critical points is obtained from the linearized lines. The curvature angle, linearity, maximum linearity, convexity, concavity and bend angle of the linearized lines are calculated from the starting line to the end line along all smoothed contours. The method can be used for shape description and recognition. The analysis, decision and classification of the biological images are based on the description of morphological structures, color information and prior knowledge, which are associated with each other. The efficiency of the algorithms is demonstrated in two applications. One is the description, recognition and analysis of color flower images. The other is the dynamic description, recognition and analysis of cell‐cycle images.
1210(2010); http://dx.doi.org/10.1063/1.3314269
There are several algorithms for DNA sequence comparison, but the problem remains a challenge. This paper presents a probabilistic way to analyze DNA sequences. Under our assumption, a DNA sequence can be regarded as a Markov chain, and we estimate its transition probabilities by the maximum likelihood method. Motivated by the definition of relative entropy, we define a measure of the shared information between two Markov chains, the improved relative Markov entropy. To account for the impact of the di‐nucleotide distributions on the Markov entropy, we present two weighted measures, the weighted relative Markov entropy and the weighted improved relative Markov entropy. Finally, we use these measures to analyze the similarities of the first exon of the β‐globin gene of eleven different species. The reasonable results obtained support the validity of our measures. Moreover, the time complexity of our measures compares favorably with that of existing methods that solve similar problems.
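The first two steps of this approach can be sketched as follows. The abstract does not give the definitions of the improved or weighted measures, so this sketch uses the standard relative entropy (Kullback-Leibler) rate between two first-order Markov chains instead. The pseudocount, which keeps every estimated probability nonzero, and all function names are assumptions of the sketch. Each sequence must contain at least two nucleotides.

```python
from collections import Counter
from math import log2

ALPHABET = "ACGT"


def transition_probs(seq, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a) from one sequence.

    The plain maximum likelihood estimate is count(ab) / count(a followed by
    anything); the pseudocount (an assumption here) smooths it so that the
    logarithm in the relative entropy is always defined.
    """
    pair_counts = Counter(zip(seq, seq[1:]))
    probs = {}
    for a in ALPHABET:
        row_total = (sum(pair_counts[(a, b)] for b in ALPHABET)
                     + pseudocount * len(ALPHABET))
        for b in ALPHABET:
            probs[(a, b)] = (pair_counts[(a, b)] + pseudocount) / row_total
    return probs


def state_freqs(seq):
    """Empirical frequency of each nucleotide as the first symbol of a pair."""
    counts = Counter(ch for ch in seq[:-1] if ch in ALPHABET)
    total = sum(counts.values())
    return {a: counts[a] / total for a in ALPHABET}


def relative_markov_entropy(seq_p, seq_q):
    """Relative entropy rate (in bits) of the chain of seq_p against seq_q."""
    p = transition_probs(seq_p)
    q = transition_probs(seq_q)
    pi = state_freqs(seq_p)
    return sum(pi[a] * p[(a, b)] * log2(p[(a, b)] / q[(a, b)])
               for a in ALPHABET for b in ALPHABET)
```

This quantity is zero when both sequences yield the same transition probabilities and positive otherwise. It is not symmetric in its two arguments, which is one reason a symmetrized or otherwise modified measure may be preferred for comparing species.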
1210(2010); http://dx.doi.org/10.1063/1.3314270
The periodic occurrence of dinucleotides observed in nucleosomes has long been thought to be closely related to the sequence‐dependent helical anisotropy of DNA sequences. We conduct a statistical analysis of the structural characteristics of dinucleotides that exhibit the 10.5 periodicity and of those that do not, using all nucleosomal structures published in the PDB. By categorizing performance in terms of the frequency distribution of value occurrences, the averaged value, the averaged net value and their respective standard deviations, we give a detailed description of the deformation preferences correlated with the periodicity for the 10 unique types of dinucleotides, and we summarize the possible roles of the significant steps in facilitating DNA bending. The results show that for some dinucleotides the stereochemical characteristics are highly periodicity‐dependent, while for others the periodicity is less important, or not important at all, for their conformations.
1210(2010); http://dx.doi.org/10.1063/1.3314271
To design life‐saving drugs, such as cancer drugs, the design of protein or DNA structures has to be accurate. These structures depend on Multiple Sequence Alignment (MSA). MSA is used to find the accurate structure of protein and DNA sequences from existing approximately correct sequences. To overcome the overly greedy nature of the well‐known global progressive alignment method for multiple sequence alignment, we propose two algorithms in this paper: one uses an iterative approach with a progressive alignment method (PAMIM), and the other uses a genetic algorithm with a progressive alignment method (PAMGA). Both methods start with a “kmer” distance table to generate a single guide tree. In the iterative approach, we introduce two new techniques: the first generates guide trees from randomly selected sequences, and the second shuffles the sequences inside that tree. The output of the tree is a multiple sequence alignment, which is evaluated by the Sum of Pairs Method (SPM) using the real‐valued data from PAM250. In our second, GA‐based approach, these two techniques are used to generate an initial population, and two different approaches to the genetic operators are implemented for crossover and mutation. To test the performance of our two algorithms, we compare them with well‐known existing methods, T‐Coffee, MUSCLE, MAFFT and ProbCons, using BAliBASE benchmarks. The experimental results show that the first algorithm works well in some situations where other existing methods have difficulty obtaining better solutions. The second method works well compared with the existing methods in all situations and outperforms the first.
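The sum‐of‐pairs evaluation mentioned above can be sketched as follows. The paper scores residue pairs with PAM250; to keep the example short and self‐contained, this sketch substitutes a toy scheme (match +1, mismatch -1, residue against gap -2, gap against gap 0). That scheme and the function names are assumptions, not the authors' settings.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Toy pairwise score standing in for a substitution matrix such as PAM250."""
    if x == "-" and y == "-":
        return 0          # two gaps aligned with each other cost nothing
    if x == "-" or y == "-":
        return gap        # residue aligned with a gap
    return match if x == y else mismatch


def sum_of_pairs(alignment, score=pair_score):
    """Sum the pairwise score over every pair of rows in every column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return sum(score(x, y)
               for column in zip(*alignment)
               for x, y in combinations(column, 2))
```

For example, `sum_of_pairs(["AC-T", "ACGT", "A-GT"])` evaluates to 0 under the toy scheme: the two fully matching columns contribute +3 each, and the two columns containing a gap contribute -3 each. A real substitution matrix can be plugged in by passing a different `score` function.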
1210(2010); http://dx.doi.org/10.1063/1.3314272
This paper presents an overview of modeling light propagation through biological media by solving the photon transport equation. Different variants of the photon transport equation (PTE) are discussed. Several methods for modeling static distributions and the transient response are presented. A discussion on how to mix and match electromagnetic problems with the PTE is also provided.
Binary Classification using Decision Tree based Genetic Programming and Its Application to Analysis of Bio‐mass Data
1210(2010); http://dx.doi.org/10.1063/1.3314262
In machine learning, pattern recognition may be the most popular task. Identifying “similar” patterns is also very important in biology. First, it is useful for predicting patterns associated with disease, for example classifying tissue as normal or tumor. Second, the similarity or dissimilarity of kinetic patterns is used to identify coordinately controlled genes or proteins involved in the same regulatory process. Third, similar genes (proteins) share similar functions. In this paper, we present an algorithm that uses genetic programming to create a decision tree for binary classification problems. The algorithm was applied to five real biological databases. Based on comparisons with well‐known methods, the algorithm performs very well in most cases.
1210(2010); http://dx.doi.org/10.1063/1.3314263
Given the current emphasis on research into human neurodegenerative diseases, an effective computing approach for the analysis of complex brain morphological changes would represent a significant technological innovation. The availability of mouse models of such disorders provides an experimental system to test novel approaches to brain image analysis. Here we use a mouse model of a neurodegenerative disorder to model changes in cerebellar morphology during the postnatal period, and we apply the GeoEntropy algorithm to measure the complexity of the morphological changes.
Justification of Fuzzy Declustering Vector Quantization Modeling in Classification of Genotype‐Image Phenotypes
1210(2010); http://dx.doi.org/10.1063/1.3314264
With the fast development of multi‐dimensional data compression and pattern classification techniques, vector quantization (VQ) has become a technique that allows a large reduction in data storage and computational effort. One of the most recent VQ techniques, fuzzy declustering‐based vector quantization (FDVQ), addresses the poor estimation of vector centroids caused by biased data from undersampling. In this paper, we therefore evaluate an FDVQ‐based hidden Markov model (HMM) to investigate its effectiveness and efficiency in the classification of genotype‐image phenotypes. The recognition accuracy of the proposed FDVQ‐based HMM (FDVQ‐HMM) is compared with that of a well‐known LBG (Linde, Buzo, Gray) vector quantization based HMM (LBG‐HMM). The experimental results show that the performances of FDVQ‐HMM and LBG‐HMM are nearly the same. Finally, we justify the competitiveness of FDVQ‐HMM in classifying a cellular phenotype image database by using a hypothesis t‐test. As a result, we validate that the FDVQ algorithm is a robust and efficient classification technique for RNAi genome‐wide screening image data.
1210(2010); http://dx.doi.org/10.1063/1.3314265
Modern society increasingly relies on technology to support everyday activities. In the past, this technology has focused on automation, using computer technology embedded in physical objects. More recently, there is an expectation that this technology will not just embed reactive automation, but also embed intelligent, proactive automation in the environment. That is, there is an emerging desire for novel technologies that can monitor, assist, inform or entertain when required, and not just when requested. This paper presents three self‐motivated, home‐assistant bot applications using different self‐motivated agent models. Self‐motivated agents use a computational model of motivation to generate goals proactively. Technologies based on self‐motivated agents can thus respond autonomously and proactively to stimuli from their environment. Three prototypes of different self‐motivated agent models, using different computational models of motivation, are described to demonstrate these concepts.