July 14, 2000

The most popular tool for detecting chaos in experimental data is
calculation
of the correlation dimension using the method of Grassberger and
Procaccia
(Phys. Rev. Lett. **50**, 346-349 (1983)). The idea is to
construct
a function *C*(*r*) proportional to the probability
that two
arbitrary points on the orbit in state space are closer together
than *r*.
This is usually done by calculating the separation between every
pair of
*N*
data points and sorting them into bins of width *dr*
proportional
to *r*. The correlation dimension is given by *D*_{2}
= *d* log(*C*) / *d* log(*r*) in the limit *dr*
--> 0, *r* --> 0, and *N* --> infinity. These
limits are inherently
incompatible for a finite data set, and for many attractors, the
computed
value of *D*_{2} converges very slowly. In the
original
Grassberger-Procaccia paper, data (with *N* = 15,000) are
shown only
for the Henon map and Lorenz model, and it turns out that these
cases are
exceptional in their rapid convergence. For this reason, the
literature
is devoid of credible calculations of the correlation dimension
for most
model chaotic systems, including the logistic map, the Rossler
attractor,
and the Chirikov map.

To study convergence of standard chaotic systems it is necessary to generate large data sets, possibly exceeding the memory of the computer. Since new data can be generated inexpensively compared to the computational cost of performing the correlation calculation, a good strategy is to save the data in a circular buffer. Each new iterate replaces the oldest point in the buffer, which is then discarded. As each new point is added to the buffer, its separation from each of the previous points is calculated and sorted into bins. A few of the most recent points are excluded to avoid errors due to temporal correlation. The number of such points is small and can be estimated from the Lyapunov exponent. For example, with the Henon map, the largest Lyapunov exponent is known to be about 0.603 bits/iteration, and so after 133 iterations in 80-bit precision, no temporal correlations should remain. For the work reported here, a buffer of 32767 points was used, and the most recent 767 points were ignored.

The accuracy with which the correlation dimension can be
calculated
is dictated by the total number of pairs of data points that are
correlated.
With abundant data, there is thus a premium on calculating and
binning
the separations as quickly as possible. The Euclidean separation
between
two points (*x*_{1},
*y*_{1}) and (*x*_{2},
*y*_{2})
is calculated from *r* = [(*x*_{2
}- *x*_{1})^{2}
+ (*y*_{2} - *y*_{1})^{2}]^{1/2}.
It has been proposed to use other measures such as the absolute
norm, *r*
= |*x*_{2 }- *x*_{1}| + |*y*_{2}
- *y*_{1}|, or the maximum norm, *r* = max[|*x*_{2
}-
*x*_{1}|,
|*y*_{2} - *y*_{1}|], but these methods
tend
to give inferior results. Note, however, that it is not necessary
to perform
the square root; it is just as good to bin the values of *r*^{2}
as it is to bin the values of *r*, since log(*r*^{2})
= 2 log(*r*)*.*

With bins of width *dr* proportional to *r*, there
are several
strategies for sorting the separations into the appropriate bin.
The conceptually
simplest method is to generate a bin index by taking the integer
part of
log(*r*^{2}). Calculation of the logarithm typically
limits
the speed of this method. A clever scheme is to make use of the
fact that
computers store floating point numbers with a mantissa and an
integer exponent
of 2. Thus it is possible to find the integer part of log_{2}(*r*^{2})
by extracting the appropriate bits of the variable *r*^{2}.
This method automatically gives 2 / log_{10}(2) = 6.64
bins per
decade, which is reasonable and sufficient for our purposes.

As a numerical example, consider the logistic map *x _{n}*

Even with over 4 million fully correlated data points, the value
of
*D*_{2}
is still significantly below 1.0, but extrapolation of the least
squares
fit (the red line) to *r* = 0 gives a value of
*D*_{2}(0)
= 1.002185. In the plot above, the error bars are calculated
assuming the
statistical error for the number of points in each bin is equal to
the
square root of that number and that the fluctuations in the number
in adjacent
bins is uncorrelated. These assumptions are probably optimistic.

The reason for the slow convergence is the highly non-uniform
measure
on the unit interval. For the logistic map, the probability
distribution
is given by *P*(*x*) = 1/pi[*x*(1-*x*)]^{1/2}.
The infinite value of *P* at *x* = 0 and *x* = 1
is a result
of the parabolic nature of the map and the fact that a finite
interval
of *x* values near *x* = 0.5 map into the same point.
Thus we
would expect this kind of dependence for any unimodal 1-D map with
a quadratic
maximum. Presumably this dependence is universal for a wide class
of maps.

More generally we define *p* = log(*C*) and *q*
= log(*r*)
and suppose that *p*(*q*) is of the form *p*
= *A*
+ *D*_{2}(0)*q* + *f*(*q*), where *f*(*q*)
is a function whose derivative *df*/*dq* approaches
zero as *r*
--> 0. For the logistic equation, *f*(*q*) = log* _{e}*(-

System |
N |
B |
D_{2}(0) |

Logistic map (A = 4) |
4.2 x 10^{6} |
0.92 | 1.01 |

Logistic map (A = 3.9) |
1.1 x 10^{6} |
0.68 | 0.99 |

Logistic map (A = 3.6) |
1.3 x 10^{6} |
0.53 | 0.98 |

Sine map | 1.1 x 10^{6} |
0.83 | 1.00 |

2 logistic maps (A = 4) |
1.9 x 10^{6} |
1.35 | 1.97 |

3 logistic maps (A = 4) |
1.1 x 10^{6} |
1.81 | 2.92 |

Uniform random data | 9.0 x 10^{5} |
0.27 | 1.03 |

2-D uniform random data | 1.1 x 10^{6} |
0.30 | 2.04 |

Alpha = 0.5 map | 1.7 x 10^{6} |
0.48 | 1.06 |

Alpha = 1.5 map | 1.2 x 10^{6} |
0.37 | 1.03 |

Alpha = 3 map | 1.7 x 10^{6} |
2.04 | 0.82 |

Henon map (b = 0.3) |
3.3 x 10^{6} |
0.13 | 1.22 |

Henon map (b = 0.1) |
1.1 x 10^{6} |
0.58 | 1.13 |

Lozi map | 1.4 x 106 | 0.05 | 1.40 |

Chirikov map (k = 1) |
1.2 x 10^{6} |
-0.16 | 1.89 |

Arnold cat map | 1.1 x 10^{6} |
-0.04 | 1.99 |

Lorenz attractor | 2.3 x 10^{6} |
-0.27 | 2.02 |

Rossler attractor | 3.0 x 10^{6} |
0.94 | 2.01 |

Ueda attractor | 2.1 x 10^{6} |
0.50 | 2.68 |

Simplest quadratic | 2.2 x 10^{6} |
1.71 | 2.12 |

Simplest piecewise linear | 2.1 x 10^{6} |
2.95 | 2.25 |

References:

J. C. Sprott,
and G.
Rowlands, Improved
Correlation Dimension Calculation, International
Journal of Bifurcation and Chaos **11**, 1865-1880 (2001).

J. C. Sprott,
*Chaos and Time-Series Analysis*
(Oxford University Press, 2003), pp.329-336.

Back to Sprott's Technical Notes