Physiological signals are both non-stationary and non-linear. Non-linear methods of measuring the complex nature of the signal are therefore vital to capturing the full picture of the physiological state. Entropy gives the researcher information concerning the regularity of a signal.1 While irregularity may be described with some accuracy, the meaning behind it has previously been difficult to ascertain.
Researchers have tested the prior standard entropy measures to separate irregularity into categories of randomness versus complex coordination underlying a measured signal.2 Studies of arrhythmia subjects and healthy controls have shown questionable results, often failing to differentiate the chaotic nature of arrhythmia ECG signals from those of young healthy subjects.3, 4 As arrhythmia is an example of a poorly adapting system with highly random interbeat intervals, a scientifically reliable entropy metric must be able to differentiate this chaotic signal from that of healthy subjects, as well as from that of elderly subjects, who are at the opposite end of the regularity spectrum, exhibiting rigid periodicity.
Until recently it has only been possible to show greater levels of irregularity. It was impossible to determine if a change in irregularity was a positive or a negative for the living system. Using the three entropy metrics provided by this website, it is now possible to understand the level of healthy coordination in a subject. This is of utmost importance in pre-/post-intervention research, as it can now be shown if an intervention offers an increased level of healthy coordination in subjects.
It also offers a window into why some medically diagnosable states, such as nocturnal enuresis, eating disorders, or duodenal ulcers, tend to exhibit a higher HRV than control groups free of those issues. Our own pilot investigation has shown that Chiropractic intervention reduces HRV in enuretics, but increases complexity. This suggests that the increased HRV in these unhealthy states may actually reflect irregularity of a more chaotic and random kind, a sign of lost health. Differentiating rigid periodicity from complex variability, and both from a tendency toward randomness, is a research opportunity: new calculations can be done, even on time series from previously published research, that could readily yield new publications and contribute to a growing body of data.
The inability to distinguish complexity from randomness is an accuracy weakness of prior entropy metrics that we have called into question earlier. A second failing is the common need for signal samples of up to many hours in length to provide reliable results for heart rate dynamics.5,6,7 The need for entropy measures requiring far shorter samples was obvious. Samples of < 5 minutes are preferable, both for simplicity of recording and for fewer opportunities for artifacts to occur. Early testing of the metrics we offer on this site has shown very similar entropy results down to fewer than fifty consecutive intervals. Other time-series recordings, such as EEG, showed good results in testing here as well. An EEG recording of only 24 seconds may contain over 4,000 samples.
Distribution Entropy (DistEn) offers a new view of control of physiological coordination. It was first described in 2015 as a means to alleviate the weaknesses found using prior entropy metrics.8 Previous entropy measures depend upon the parametric constraint "r", the tolerance of inter-vector distance. It has been shown that r is very susceptible to errors of estimation.9 Small alterations of r in the calculation can cause massive errors.
This renders Approximate Entropy, Sample Entropy, and Fuzzy Entropy (all based upon Kolmogorov entropy) of little value without extremely long sample lengths of >15 minutes, if at all.10,11
This problem is addressed by Distribution Entropy. Based upon Shannon entropy, it eliminates the tolerance issue (r) by using different variables that are far less susceptible to estimation issues. DistEn is a function of three parameters: data length N, embedding dimension m, and the number of bins M used in the probability distribution. Altering the choice of embedding dimension (m) and bin number (M) has been shown to have far less influence on results than altering r. This renders the algorithm quite stable and reliable.
Distribution entropy has shown good reliability, and the ability to separate known arrhythmia signals from controls with less than 1 minute of recording.12
The DistEn algorithm first requires specifying values for both m and τ (time delay factor). The specific choices of embedding dimension (m) and time delay (τ) determine the appropriateness of the state space reconstruction of a time-series. Both are important determining factors for DistEn. In some studies13,14 of heart rate interbeat intervals, it is common to set m=2 and τ=1. It has been suggested that these settings may not be as fitting for time-series like EEG signals, whose sampling frequencies vary from hundreds of Hz to several thousand Hz.15 Gathering data at different sampling frequencies leads to altered oscillation attributes that influence the determination of m and τ.16,17
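The roles of m, τ, and M can be made concrete with a short sketch. The code below is a minimal illustration, not the code used by this calculator. It assumes the usual description of DistEn: delay vectors are built from the series, the Chebyshev (maximum-coordinate) distance is taken between every pair of vectors, the distances are binned into M equal-width bins, and the Shannon entropy of that histogram is divided by log2(M). The function names `embed` and `dist_en`, the default values, and the sample series are all illustrative assumptions.

```python
import math


def embed(series, m, tau):
    """Return the delay vectors [x[i], x[i+tau], ..., x[i+(m-1)*tau]]."""
    n_vectors = len(series) - (m - 1) * tau
    return [[series[i + j * tau] for j in range(m)] for i in range(n_vectors)]


def dist_en(series, m=2, tau=1, bins=512):
    """Normalized Shannon entropy of the inter-vector distance histogram."""
    vectors = embed(series, m, tau)
    if len(vectors) < 2:
        raise ValueError("series too short for this m and tau")
    # Chebyshev distance for every pair i < j; the distance matrix is
    # symmetric, so this gives the same histogram shape as all i != j.
    distances = [
        max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    ]
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return 0.0  # every distance identical: only one bin is occupied
    width = (hi - lo) / bins
    counts = [0] * bins
    for d in distances:
        # Clamp so the largest distance lands in the last bin.
        counts[min(int((d - lo) / width), bins - 1)] += 1
    total = len(distances)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c)
    return entropy / math.log2(bins)


# Made-up, deterministic "interbeat intervals" in seconds (not real data).
intervals = [0.8 + 0.05 * math.sin(0.7 * i) for i in range(60)]
print(dist_en(intervals, m=2, tau=1, bins=512))
```

Note that no tolerance r appears anywhere in the sketch; the only choices are m, τ, and the bin number, and the result always lies between 0 and 1.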
Guatma put forth research on the parameters for typical signals that researchers commonly use.18 Other researchers have done extensive work to offer parameter suggestions for calculations of Distribution Entropy.19 The table below offers suggested parameter ranges for common signals, with which this calculator is most likely to provide accurate entropy results. The calculator allows the researcher uploading a file to select any value they wish to use, whether from the table below or not.
As relates to all three of the entropy measures available here, most common entropy metrics identify randomness, but not complexity. These three entropy types we offer for use reach beyond that limitation to define complexity itself. This makes them unique in the realm of entropy to this point.
Phase entropy uses a Cartesian scatter plot to place each interval of a time series in one of four quadrants. Placement depends only on the dynamics of a single interval in a common Poincaré plot. The beat is either an acceleration or deceleration from the previous interval. Unlike the Poincaré plot, the four-quadrant phase entropy plot compares the preceding and following intervals in series. Plotting via phase entropy allows an understanding of the rate of variability, not just the degree of variability to which the Poincaré plot is limited.20 By doing so, phase entropy allows the visualization of both linear and non-linear dynamics.
It is important to note that phase entropy has not been broadly researched and published on for various signal types. Distribution entropy has been tested using many different signals, as it has been over five years since it was developed. Phase entropy is merely a single year old at this writing.
The parameter "k" used here represents the number of divisions of the plot. As there are four quadrants, multiples of four should be used. The original authors reported the best results at k > 15; their recommendation is therefore 16.
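The four-quadrant plot and the parameter k can be sketched in a few lines. This is a minimal illustration, not the code used by this calculator. It assumes that each point of the plot pairs the change into a sample (horizontal axis) with the change out of it (vertical axis), that the plane is split into k equal angular sectors, and that each sector is weighted by the sum of the angles of the points falling in it before a normalized Shannon entropy is taken. That weighting is our reading of the published method and should be checked against the original paper; the function name `phase_entropy` and the sample values are illustrative assumptions.

```python
import math


def phase_entropy(series, k=16):
    """Normalized Shannon entropy of angle-weighted sector occupancy."""
    if len(series) < 3:
        raise ValueError("need at least three samples")
    sector_width = 2 * math.pi / k
    sector_sums = [0.0] * k
    for n in range(len(series) - 2):
        x = series[n + 1] - series[n]      # change into the middle sample
        y = series[n + 2] - series[n + 1]  # change out of the middle sample
        theta = math.atan2(y, x) % (2 * math.pi)  # angle mapped to [0, 2*pi)
        idx = min(int(theta / sector_width), k - 1)
        sector_sums[idx] += theta  # assumption: weight each sector by angle sum
    total = sum(sector_sums)
    if total == 0:
        return 0.0  # every point lies at angle zero
    entropy = -sum(
        (s / total) * math.log(s / total) for s in sector_sums if s > 0
    )
    return entropy / math.log(k)


# Made-up interbeat intervals in seconds (not real data).
intervals = [0.80, 0.82, 0.79, 0.85, 0.81, 0.90, 0.70, 0.75, 0.72, 0.80]
print(phase_entropy(intervals, k=16))
```

A series that changes by the same amount at every step puts all points at one angle, so only one sector is occupied and the sketch returns zero.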
While multiscale entropy methods (MSE) have been used considerably in published research, investigation has shown MSE unreliable at quantifying HRV. MSE also requires a rather lengthy time series to achieve results.
To improve both utility and accuracy, an alternative method has been developed that calculates distribution entropy on multiple temporal scales using a moving average. This new method, multiscale distribution entropy (MSDE), can therefore solve the inherent sequence length issue of the coarse-graining methods used in calculation of ordinary MSE. By using portions of the time series not regarded in MSE, shorter segments provide more plentiful calculations. This technique is illustrated below.
As illustrated, coarse-graining compares individual pairs at Scale 2. In the twenty-sample example (A), this method offers ten measurements. In example (B), using moving-averaging, the second half of each sample pair is also used in the following pair, yielding nineteen pairs from the same twenty-sample set. Shorter samples can therefore yield more data points than they would with typical coarse-graining, as used in multiscale entropy (MSE). For this reason, MSE requires quite long samples to find accurate results.
It can also be seen that as the scale number increases from 1 to 20, incredibly long time-series samples would be required using coarse-graining methods to achieve the number of data calculations derived by moving-averaging. Comparing data samples in sets of five, let alone at the scale factor of twenty, begins to limit the data derived from coarse-graining (C) compared to the methods used in MSDE (D). The former compares only four sets of data from the twenty interval samples, while the latter derives sixteen sets of data from those same twenty samples.
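The counts quoted above can be checked with a small sketch. It assumes that coarse-graining averages consecutive non-overlapping windows while moving-averaging slides the window forward one sample at a time; the function names and the twenty made-up samples are illustrative only.

```python
def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n_windows = len(series) // scale
    return [
        sum(series[i * scale:(i + 1) * scale]) / scale
        for i in range(n_windows)
    ]


def moving_average(series, scale):
    """Average every overlapping window of length `scale`, stepping by one."""
    n_windows = len(series) - scale + 1
    return [sum(series[i:i + scale]) / scale for i in range(n_windows)]


samples = list(range(1, 21))  # twenty made-up samples
for scale in (2, 5):
    print(scale, len(coarse_grain(samples, scale)),
          len(moving_average(samples, scale)))
# prints:
# 2 10 19
# 5 4 16
```

In general, a series of N samples gives floor(N / scale) coarse-grained values but N - scale + 1 moving-average values, which is why the gap widens as the scale grows.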
While MSDE has a relatively short history of use and publication of research results, the results were quite impressive.
The published study showed MSDE was able to differentiate the three groups of subjects (young, elderly, and congestive heart failure) with p values from 0.008 to 8.88 × 10⁻¹¹, using samples of 100 intervals. These samples were recorded at 125 Hz, far lower than the 250 Hz currently being recommended. It has been shown that lower-frequency samples may have an inherent weakness in entropy calculation.22
As this research was published in March 2020, only ECG-derived interbeat intervals (IBIs) have been tested thus far. It is reasonable to expect that MSDE may yield promising accuracy in other time series, especially those already verified using the original distribution entropy.
The parameter variables are the same as those for distribution entropy. The lone published study set the embedding dimension (m) and the number of bins to [m = 2, bins = 512]. Further efforts may consider varying those parameters to fuel further research and understanding of this promising entropy measure capable of determining complexity.
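Putting the pieces together, MSDE can be sketched as distribution entropy applied to the moving-averaged series at each scale. This is a minimal illustration under stated assumptions, not the published implementation or the code behind this calculator: the defaults m = 2 and bins = 512 follow the values quoted above, while τ = 1, the scale range of 1 to 20, the function names, and the sample series are our own assumptions.

```python
import math


def moving_average(series, scale):
    """Mean of every overlapping window of length `scale`, stepping by one."""
    return [sum(series[i:i + scale]) / scale
            for i in range(len(series) - scale + 1)]


def dist_en(series, m=2, tau=1, bins=512):
    """Normalized Shannon entropy of the inter-vector distance histogram."""
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors < 2:
        raise ValueError("series too short for this m and tau")
    vectors = [[series[i + j * tau] for j in range(m)]
               for i in range(n_vectors)]
    # Chebyshev distance between every pair of delay vectors.
    distances = [max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                 for i in range(n_vectors) for j in range(i + 1, n_vectors)]
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return 0.0  # only one bin would be occupied
    counts = [0] * bins
    for d in distances:
        counts[min(int((d - lo) / (hi - lo) * bins), bins - 1)] += 1
    total = len(distances)
    entropy = -sum(c / total * math.log2(c / total) for c in counts if c)
    return entropy / math.log2(bins)


def msde(series, max_scale=20, m=2, tau=1, bins=512):
    """DistEn of the moving-averaged series at scales 1..max_scale."""
    return [dist_en(moving_average(series, s), m, tau, bins)
            for s in range(1, max_scale + 1)]


# 100 made-up interbeat intervals in seconds (not real data).
intervals = [0.8 + 0.05 * math.sin(0.7 * i) + 0.02 * math.cos(2.3 * i)
             for i in range(100)]
profile = msde(intervals)
print(len(profile), profile[0], profile[-1])
```

Even at scale 20, the 100-interval series still leaves 81 averaged values to work with, whereas coarse-graining would leave only 5.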
|Time Series Type|M (bin number)|m (embedding dimension), Entropy Ratio Estimation Method|τ (time delay), Entropy Ratio Estimation Method|m (embedding dimension), TDMI/FNN Selection Method|τ (time delay), TDMI/FNN Selection Method|k (multiple of 4 recommended)|
|---|---|---|---|---|---|---|
|ECG|128+ (256, 512, 1024, ...) 23|5 or 2-5 24|2|6|10|16|
|EEG| |5|9|7|11|(yet to be determined)|
The project to create this site, as well as the conversion of the algorithms to compatible code, was led by our data scientist, Gábor Balló. He is an accomplished scientist and educator, living in Denmark. He received his PhD in Mathematical Physics. If you have occasion to need a sharp data scientist, here is a link to discuss your data concepts with him.
We must also offer sincere thanks to the research nonprofit, The Center for Chiropractic Progress and their president, Dr. Lance Lorfeld. Upon hearing of our project, The Center offered a grant to fully fund the creation of these vital algorithms. It is rare to find an organization that is as transparent and focused on always doing what must be done. If you find this site helpful, which you will, consider following this link to see what they are about, and donate to this deserving nonprofit. Dollar for dollar there is no group I know of that accomplishes as much.