Networks, Evolution, Science & Neural Systems
Alex Bäcker
Abstract.
Recent times have seen the advent of large amounts of data on networks of diverse
kinds, from the WWW and citation networks to protein and gene expression
networks. Part of my work has been aimed at extracting insight out of
these massive collections of data. We show, for example, that recent
years have seen an expansion in the memory of science and a homogenization
of citation distributions. In parallel, I have been developing mathematical
methods to extract information from multi-neuron recordings of brain
activity. More generally, I am addressing a variety of open questions
at the interface of biology, math and computation.
Research Summary
Software as Networks
I have developed a software tool that constructs graphs to represent
a collection of software programs, where each node represents a program
and each edge represents a call from one program to another. I have
calculated statistical properties of several of these graphs.
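As a rough illustration of the graph described above, the following sketch builds a directed call graph from (caller, callee) pairs and computes a few basic statistics. The input format, the function names and the sample program names are assumptions made for this example; the tool itself and the statistics it reports are not specified here.

```python
from collections import defaultdict


def build_call_graph(calls):
    """Build a directed graph from (caller, callee) pairs.

    Each node is a program; each edge is a call from one program to another.
    """
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
        graph[callee]  # ensure programs that call nothing still appear as nodes
    return dict(graph)


def degree_stats(graph):
    """Return node count, edge count, and in/out degree of every node."""
    out_degree = {node: len(targets) for node, targets in graph.items()}
    in_degree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    n_nodes = len(graph)
    n_edges = sum(out_degree.values())
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "mean_out_degree": n_edges / n_nodes if n_nodes else 0.0,
        "in_degree": in_degree,
        "out_degree": out_degree,
    }


# Hypothetical call records, for illustration only.
calls = [
    ("main", "parse"),
    ("main", "report"),
    ("parse", "tokenize"),
    ("report", "tokenize"),
]
graph = build_call_graph(calls)
stats = degree_stats(graph)
print(stats["nodes"], stats["edges"], stats["in_degree"]["tokenize"])
```

Degree distributions are only one of many statistics that could be computed on such a graph; they are used here because they need nothing beyond the edge list.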
A Revolution in Scientific Production (with Kevin Boyack)
Analyzing data from the entire Science Citation Index (SCI) and historic
science funding levels, Kevin Boyack and I discovered that scientific
production, measured in published papers indexed by the SCI, underwent
a dramatic change in growth pattern shortly after 1960, changing from
exponential growth to linear growth with a slope much higher than that
of the exponential at the discontinuity point. We have traced this change
to a rather abrupt increase in U.S. science funding, which has continued
to grow linearly to this day. The slow exponential growth of the
early days is consistent with growth driven by each professor taking
on a fixed number of apprentices over a career. Modern science instead shows
growth that appears to be limited by funding levels rather than
by the number of academics available to teach the profession.
The Expanding Memory of Science (with Kevin Boyack)
Kevin Boyack and I discovered that the memory of science, measured by
the age of citations, has been growing in the past several years. We
have further separated this expansion of memory into three components: that
due to aging of the scientific literature, that due to the growth of science,
and that which cannot be explained by either growth or aging.
The Socializing Effect of the WWW on Citation Distributions
Analyzing the structure of citation graphs with appropriate normalizations,
I have found that the Internet has a socializing or equalizing effect on
citation distributions, and that articles available online are five times
as likely to be cited as those that are not, even within the same journal.
I hope to submit this to a general-interest journal soon.
A Mathematical Formulation for Curiosity
I have developed a mathematical formulation for curiosity that may be
useful for guiding exploration by robots and machine learning algorithms.
It addresses a well-known problem: autonomous robots tend to cluster
around what they know best once they have found rewarding stimuli.
The formulation remains to be tested in simulations or robot runs. The
basic idea is the following: learning behavior should seek to maximize
not prediction success, but change in predictions. To do this, the learner
should venture into spaces where its predictions are very inaccurate. This
can be encouraged automatically by rewarding increases in prediction
confidence. Confidence can be modeled simply as abs(p - 0.5), where p is
the predicted probability of an event occurring.
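The reward described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a tested formulation: the function names are invented for this example, and clipping decreases in confidence to a reward of zero is one possible reading of "rewarding increases".

```python
def confidence(p):
    """Confidence of a predicted probability p, modeled as abs(p - 0.5)."""
    return abs(p - 0.5)


def curiosity_reward(p_before, p_after):
    """Reward an increase in prediction confidence.

    Assumption for this sketch: decreases in confidence earn zero reward
    rather than a penalty.
    """
    return max(0.0, confidence(p_after) - confidence(p_before))


# Unfamiliar region: the learner starts at chance and learns a lot.
novel = curiosity_reward(0.5, 0.9)

# Familiar region: the learner is already confident and gains little.
familiar = curiosity_reward(0.95, 0.96)

print(novel > familiar)
```

Under this reward, a learner has more to gain in regions where its predictions are still near chance than in regions it already predicts confidently, which is the opposite of clustering around what it knows best.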
Higher-order Correlations (with Elebeoba May)
Correlation, like distance, has remained largely a measure defined between
two variables ever since Sir Francis Galton introduced the concept in
1888 (Galton, 1888). Even multivariate correlational analysis relies
on computing the correlation between two (and only two) variables: an
unmodified variable and a composite variable consisting of the weighted
sum of 2 or more variates (DuBois, 1957). And yet pairwise correlations
do not uniquely characterize the interactions between a set of N variables,
and it is an important problem in data mining and many fields of science
to determine whether additional interactions exist. I aim to provide
a computationally tractable definition or approximation of the N-way
independence (maximum entropy or lack of structure beyond lower-order
correlations) between N variables, with particular emphasis on binary
variables such as neuronal spike trains. Iterative algorithms for the
calculation of the maximum entropy distributions exist (Gokhale and
Kullback, 1978), but their implementation for N>20 is out of reach
for even the fastest supercomputers (Bohte et al., 2000). We are implementing
a fast algorithm for the detection of higher order correlations, using
a generalization of the cross-product ratio.
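To make the problem concrete, the sketch below computes the ordinary two-variable cross-product ratio and one standard three-variable extension (the ratio of the two conditional cross-product ratios, which equals 1 when there is no three-way interaction). This is not necessarily the generalization being implemented, which is not specified here; the function names and the pseudo-count of 0.5 used to avoid division by zero are assumptions of this example.

```python
from itertools import product


def count_patterns(trains):
    """Count joint 0/1 patterns across equal-length binary sequences."""
    counts = {}
    for pattern in zip(*trains):
        counts[pattern] = counts.get(pattern, 0) + 1
    return counts


def cross_product_ratio(counts, pseudo=0.5):
    """Two-variable cross-product (odds) ratio; 1 means no association."""
    n = {k: counts.get(k, 0) + pseudo for k in product((0, 1), repeat=2)}
    return (n[1, 1] * n[0, 0]) / (n[1, 0] * n[0, 1])


def three_way_ratio(counts, pseudo=0.5):
    """Ratio of the (x, y) cross-product ratios at z = 1 and at z = 0.

    A value of 1 indicates no three-way interaction.
    """
    n = {k: counts.get(k, 0) + pseudo for k in product((0, 1), repeat=3)}
    ratio_z1 = (n[1, 1, 1] * n[0, 0, 1]) / (n[1, 0, 1] * n[0, 1, 1])
    ratio_z0 = (n[1, 1, 0] * n[0, 0, 0]) / (n[1, 0, 0] * n[0, 1, 0])
    return ratio_z1 / ratio_z0


# Toy binary sequences: z is the exclusive-or of x and y.
x = [0, 0, 1, 1] * 10
y = [0, 1, 0, 1] * 10
z = [a ^ b for a, b in zip(x, y)]

pair_xy = cross_product_ratio(count_patterns([x, y]))
pair_xz = cross_product_ratio(count_patterns([x, z]))
pair_yz = cross_product_ratio(count_patterns([y, z]))
triple = three_way_ratio(count_patterns([x, y, z]))
print(pair_xy, pair_xz, pair_yz, triple)
```

In this toy example every pairwise ratio equals 1, so no pair of variables looks associated, yet the three-way ratio is far from 1. It is a small instance of pairwise correlations failing to characterize the interactions among a set of variables.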
Earthshine: A Visual Illusion
I have described a novel visual illusion concerning the Moon:
observers perceive the illuminated part as belonging to a circle of
greater diameter than the dark part (Earthshine).
Cellular automata and Randomness
I proved wrong a contention in Stephen Wolfram’s book, A New
Kind of Science, about the generation of apparent randomness by simple
automata with simple initial conditions, by showing that statistical
regularities exist in the binary sequence produced by the central column
of rule 30.
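For readers who want to inspect the sequence in question, the sketch below generates the central column of rule 30 from a single live cell and tabulates how often each k-bit block occurs. It only produces the data; it does not reproduce the statistical analysis referred to above, whose details are not given here. The function names are assumptions of this example.

```python
from collections import Counter


def rule30_center_column(steps):
    """Return the first `steps` values of the central column of rule 30.

    Rule 30 update: new cell = left XOR (center OR right).
    The automaton starts from a single 1 on a background of 0s.
    """
    width = 2 * steps + 1  # wide enough that the pattern never reaches an edge
    center = steps
    cells = [0] * width
    cells[center] = 1
    column = []
    for _ in range(steps):
        column.append(cells[center])
        padded = [0] + cells + [0]
        cells = [padded[i] ^ (padded[i + 1] | padded[i + 2]) for i in range(width)]
    return column


def block_counts(bits, k):
    """Count occurrences of each overlapping k-bit block in the sequence."""
    return Counter(tuple(bits[i:i + k]) for i in range(len(bits) - k + 1))


column = rule30_center_column(500)
pairs = block_counts(column, 2)
print(column[:6])
print(sum(column) / len(column))
print(pairs)
```

Block frequencies are only one simple diagnostic; any test of randomness could be applied to `column` in the same way.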
A Robust Measure for Spike Timing Jitter
Measuring the precision of spike timing under dynamic conditions is
problematic due to the difficulty of ascribing matches between spikes
in the face of dropped spikes. Existing methods draw an arbitrary cutoff
for the maximum allowed separation between matching spikes. We have
developed a measure robust to variations in the choice of such maximum
separation.
A Measure for Collective Synchronization
Our previous work has shown that neuronal synchronization is crucial to the
decoding of information in neural assemblies. Yet methods to measure
non-oscillatory synchronization in multineuron recordings are lacking.
We propose the variance of the spike count across a neuronal assembly
as a measure for collective synchronization.
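One possible reading of this measure can be sketched as follows. It assumes spike trains binned into 0/1 indicators, sums them across the assembly in each time bin, and takes the variance of that population count over bins. The binning, the choice of population variance and the function names are assumptions of this example, not details given above.

```python
def population_counts(spike_trains):
    """Sum 0/1 spike indicators across all neurons in each time bin."""
    return [sum(bin_values) for bin_values in zip(*spike_trains)]


def count_variance(spike_trains):
    """Variance over time bins of the assembly-wide spike count."""
    counts = population_counts(spike_trains)
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)


# Three hypothetical neurons, two spikes each, six time bins.
synchronous = [
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
]
staggered = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
]

sync_var = count_variance(synchronous)
stag_var = count_variance(staggered)
print(sync_var, stag_var)
```

Both toy assemblies have the same firing rates, but the synchronous one concentrates its spikes in the same bins and so has the larger count variance. The measure also requires no assumption that the activity is oscillatory.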
Ecological Determinants of Stable Equilibria with Multiple Coexisting Genotypes
Perhaps the oldest mystery of evolution and ecology is the origin of
biodiversity. Different species can co-exist stably in the same niche
for ages. In contrast, all intermediates between humans and our last common
ancestor with the great apes have disappeared. Additionally, all species examined
exhibit a striking abundance of polymorphisms. We have put forth a theory
to explain biodiversity.