Dataset statistics
| Number of variables | 1 |
|---|---|
| Number of observations | 13880 |
| Missing cells | 1283 |
| Missing cells (%) | 9.2% |
| Duplicate rows | 2473 |
| Duplicate rows (%) | 17.8% |
| Total size in memory | 216.9 KiB |
| Average record size in memory | 16.0 B |
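The percentages and the average record size in this table can be re-derived from the raw counts. The sketch below does so using only figures copied from the table. It assumes that, because the dataset has a single variable, the number of cells equals the number of observations, and that 1 KiB is 1024 bytes.

```python
# Figures copied from the "Dataset statistics" table.
n_obs = 13880
missing_cells = 1283
duplicate_rows = 2473
total_kib = 216.9

# With one variable, cells == observations (assumption stated above).
missing_pct = round(100 * missing_cells / n_obs, 1)
duplicate_pct = round(100 * duplicate_rows / n_obs, 1)

# Average record size in bytes, taking 1 KiB = 1024 B.
avg_record_bytes = round(total_kib * 1024 / n_obs, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)  # 9.2 17.8 16.0
```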
Variable types
| TimeSeries | 1 |
|---|
Timeseries statistics
| Number of series | 1 |
|---|---|
| Time series length | 13880 |
| Starting point | 1983-01-01 00:00:00 |
| Ending point | 2020-12-31 00:00:00 |
| Period | 1 day |
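The reported series length is consistent with a complete daily index between the starting and ending points. A minimal check, assuming both endpoints are included in the count:

```python
from datetime import date, timedelta

# Endpoints and period copied from the "Timeseries statistics" table.
start = date(1983, 1, 1)
end = date(2020, 12, 31)
period = timedelta(days=1)

# Number of daily timestamps, counting both endpoints.
expected_length = (end - start) // period + 1

print(expected_length)  # 13880
```

Since the expected length equals the number of observations, the index itself appears to have no missing dates; the 1283 missing cells are NaN values in Flow rather than absent rows.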
Alerts
| Dataset has 2473 (17.8%) duplicate rows | Duplicates |
|---|---|
| Flow has 1283 (9.2%) missing values | Missing |
| Flow is non stationary | Non stationary |
| Flow is seasonal | Seasonal |
Reproduction
| Analysis started | 2024-05-12 18:18:43.779473 |
|---|---|
| Analysis finished | 2024-05-12 18:18:45.726078 |
| Duration | 1.95 seconds |
| Missing | Q_Station_NA_25027680_ok_Missing.csv |
| Download configuration | config.json |
Flow
Numeric time series
Alerts: MISSING, NON STATIONARY, SEASONAL
| Distinct | 5345 |
|---|---|
| Distinct (%) | 42.4% |
| Missing | 1283 |
| Missing (%) | 9.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5373.0043 |
|---|---|
| Minimum | 997.38 |
| Maximum | 11127 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 216.9 KiB |
Quantile statistics
| Minimum | 997.38 |
|---|---|
| 5-th percentile | 2334.2 |
| Q1 | 4160.8 |
| median | 5308.3 |
| Q3 | 6614 |
| 95-th percentile | 8346.8 |
| Maximum | 11127 |
| Range | 10129.62 |
| Interquartile range (IQR) | 2453.2 |
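The last two rows of this table are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short check using the values as printed:

```python
# Values copied from the "Quantile statistics" table.
minimum, q1, q3, maximum = 997.38, 4160.8, 6614.0, 11127.0

value_range = round(maximum - minimum, 2)  # Range = max - min
iqr = round(q3 - q1, 1)                    # IQR = Q3 - Q1

print(value_range, iqr)  # 10129.62 2453.2
```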
Descriptive statistics
| Standard deviation | 1810.3204 |
|---|---|
| Coefficient of variation (CV) | 0.3369289 |
| Kurtosis | -0.15153936 |
| Mean | 5373.0043 |
| Median Absolute Deviation (MAD) | 1228.3 |
| Skewness | 0.15155502 |
| Sum | 67683735 |
| Variance | 3277260.1 |
| Monotonicity | Not monotonic |
| Augmented Dickey-Fuller test p-value | 8.617728573 × 10⁻¹⁷ |
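Several of the descriptive statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of non-missing values. The sketch below checks these against the printed figures. It assumes the mean and sum are computed over the 13880 − 1283 = 12597 non-missing observations, and it allows a small relative tolerance because the report shows rounded values.

```python
import math

# Figures copied from the report.
n_obs, n_missing = 13880, 1283
mean, std = 5373.0043, 1810.3204
reported_cv = 0.3369289
reported_variance = 3277260.1
reported_sum = 67683735

# Assumption: statistics are taken over non-missing values only.
n_valid = n_obs - n_missing

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance as the square of the standard deviation
total = mean * n_valid   # sum recovered from the mean

# The inputs are rounded, so compare with a relative tolerance.
cv_ok = math.isclose(cv, reported_cv, rel_tol=1e-6)
variance_ok = math.isclose(variance, reported_variance, rel_tol=1e-6)
sum_ok = math.isclose(total, reported_sum, rel_tol=1e-6)

print(n_valid, cv_ok, variance_ok, sum_ok)  # 12597 True True True
```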
Histogram with fixed size bins (bins=50)
Gap statistics
| number of gaps | 29 |
|---|---|
| min | 5 days |
| max | 32 weeks and 3 days |
| mean | 6 weeks, 1 day and 1 hour |
| std | 6 weeks, 2 days and 59 minutes |
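The report does not state how gaps are detected. One plausible definition, used here purely as an assumption, is that a gap is any spacing between consecutive non-missing timestamps that exceeds the series period (1 day). The sketch below implements that definition with a hypothetical helper `find_gaps` and toy data that is not taken from the report.

```python
from datetime import date, timedelta

def find_gaps(observations, period=timedelta(days=1)):
    """Return (last_valid, next_valid, spacing) for each gap.

    `observations` is a list of (date, value) pairs sorted by date, with
    None marking a missing value. A gap is a spacing between consecutive
    valid dates that exceeds `period` (an assumed definition).
    """
    valid_dates = [d for d, v in observations if v is not None]
    gaps = []
    for previous, current in zip(valid_dates, valid_dates[1:]):
        spacing = current - previous
        if spacing > period:
            gaps.append((previous, current, spacing))
    return gaps

# Toy data (not from the report): 10 daily points, days 3 to 5 missing.
start = date(2020, 1, 1)
toy = [
    (start + timedelta(days=i), None if i in (3, 4, 5) else 100.0 + i)
    for i in range(10)
]

gaps = find_gaps(toy)
lengths = [spacing for _, _, spacing in gaps]
print(len(gaps), lengths[0])  # 1 4 days, 0:00:00
```

Under this definition, missing values at the very start or end of the series have no valid neighbour on one side and are therefore not counted as a gap.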
| Value | Count | Frequency (%) |
|---|---|---|
| 8040 | 20 | 0.1% |
| 5520 | 19 | 0.1% |
| 4662 | 16 | 0.1% |
| 7335 | 16 | 0.1% |
| 7350 | 15 | 0.1% |
| 4603 | 15 | 0.1% |
| 5638 | 15 | 0.1% |
| 6607 | 15 | 0.1% |
| 4310 | 15 | 0.1% |
| 4116 | 14 | 0.1% |
| Other values (5335) | 12437 | 89.6% |
| (Missing) | 1283 | 9.2% |
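The counts in this table can be cross-checked against the totals reported earlier: the ten listed values, the "Other values" row, and the missing row should add up to the number of observations, and the ten listed values plus the 5335 other distinct values should equal the distinct count. The sketch assumes that Frequency (%) is relative to all observations and that Distinct (%) is relative to non-missing observations; both assumptions reproduce the printed figures.

```python
# Counts copied from the common-values table and the Flow overview.
top10_counts = [20, 19, 16, 16, 15, 15, 15, 15, 15, 14]
other_count, other_distinct = 12437, 5335
n_missing, n_obs, n_distinct = 1283, 13880, 5345

# Rows of the table partition the observations.
rows_total = sum(top10_counts) + other_count + n_missing

# Distinct values: the 10 listed plus the "other" distinct values.
distinct_total = len(top10_counts) + other_distinct

# Frequency of "Other values" relative to all observations (assumption).
other_pct = round(100 * other_count / n_obs, 1)

# Distinct (%) relative to non-missing observations (assumption).
distinct_pct = round(100 * n_distinct / (n_obs - n_missing), 1)

print(rows_total, distinct_total, other_pct, distinct_pct)
# 13880 5345 89.6 42.4
```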
| Value | Count | Frequency (%) |
|---|---|---|
| 997.38 | 1 | < 0.1% |
| 1002.9 | 1 | < 0.1% |
| 1009.4 | 1 | < 0.1% |
| 1011.4 | 2 | < 0.1% |
| 1012.4 | 1 | < 0.1% |
| 1020.7 | 1 | < 0.1% |
| 1021.4 | 1 | < 0.1% |
| 1063.3 | 1 | < 0.1% |
| 1071.9 | 1 | < 0.1% |
| 1078.1 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
|---|---|---|
| 11127 | 1 | < 0.1% |
| 11108 | 1 | < 0.1% |
| 11099 | 1 | < 0.1% |
| 11090 | 2 | < 0.1% |
| 11081 | 1 | < 0.1% |
| 11072 | 1 | < 0.1% |
| 11053 | 1 | < 0.1% |
| 11044 | 1 | < 0.1% |
| 11026 | 1 | < 0.1% |
| 11017 | 5 | < 0.1% |
ACF and PACF
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
| Date | Flow |
|---|---|
| 1983-01-01 | NaN |
| 1983-01-02 | NaN |
| 1983-01-03 | NaN |
| 1983-01-04 | NaN |
| 1983-01-05 | NaN |
| 1983-01-06 | NaN |
| 1983-01-07 | NaN |
| 1983-01-08 | NaN |
| 1983-01-09 | NaN |
| 1983-01-10 | NaN |
| Date | Flow |
|---|---|
| 2020-12-22 | 7383.0 |
| 2020-12-23 | 7332.3 |
| 2020-12-24 | 7280.6 |
| 2020-12-25 | 7210.8 |
| 2020-12-26 | 7157.3 |
| 2020-12-27 | 7093.9 |
| 2020-12-28 | 7048.6 |
| 2020-12-29 | 6983.3 |
| 2020-12-30 | 6905.5 |
| 2020-12-31 | 6849.5 |
Most frequently occurring
| | Flow | # duplicates |
|---|---|---|
| 2472 | NaN | 1283 |
| 2275 | 8040.0 | 20 |
| 1342 | 5520.0 | 19 |
| 879 | 4662.0 | 16 |
| 2110 | 7335.0 | 16 |
| 719 | 4310.0 | 15 |
| 847 | 4603.0 | 15 |
| 1396 | 5638.0 | 15 |
| 1846 | 6607.0 | 15 |
| 2113 | 7350.0 | 15 |