Efficient Stratified Sampling Graphing Method for Mass Data

Open Access | Nov 2019

Figures & Tables

Figure 1

Example of efficient stratified sampling graphing. The black line represents a full dataset that is divided into eight subsets. The red line represents the sampled graph that links the maximum and minimum values (red dots) of each subset in their order of appearance.
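
The sampling step described in this caption (split the data into subsets, keep each subset's minimum and maximum, and connect them in their order of appearance) can be sketched in Python as follows. This is only an illustrative sketch: the function name `minmax_sample`, the contiguous near-equal split, and the tie-breaking rule (first occurrence wins) are assumptions made here, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def minmax_sample(values: Sequence[float], n_subsets: int) -> List[Tuple[int, float]]:
    """Keep the min and max of each contiguous subset, in order of appearance.

    Returns (index, value) pairs that can be joined into a polyline.
    """
    n = len(values)
    if n == 0 or n_subsets <= 0:
        return []
    n_subsets = min(n_subsets, n)  # avoid empty subsets

    sampled: List[Tuple[int, float]] = []
    for k in range(n_subsets):
        # Assumed split rule: contiguous chunks of near-equal length.
        start = k * n // n_subsets
        stop = (k + 1) * n // n_subsets
        chunk = range(start, stop)

        # On ties, min/max return the first matching index.
        i_min = min(chunk, key=lambda i: values[i])
        i_max = max(chunk, key=lambda i: values[i])

        # Emit the two extremes in the order they occur in the data;
        # a constant chunk (i_min == i_max) contributes a single point.
        for i in sorted({i_min, i_max}):
            sampled.append((i, values[i]))
    return sampled


if __name__ == "__main__":
    data = [0, 5, 2, -1, 3, 3, 9, 4]
    for index, value in minmax_sample(data, n_subsets=2):
        print(index, value)
```

Each subset contributes at most two points, so the sampled polyline has at most twice as many vertices as there are subsets, regardless of how large the full dataset is.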

Figure 2

Flow chart of the efficient stratified sampling graphing method.

Figure 3

Magnified polyline graph.

Figure 4

Comparison of the full dataset graph (a) with the sampled graphs obtained when the full dataset of 8,640,000 points is divided into 2X (b) and 1X (c) subsets.

Figure 5

Similarity of nX subsets for sampling with (a) the pixel contrast method, (b) the curve area method, and (c) the envelope line method; (d) shows the average similarity for 6,000, 60,000, 120,000, 600,000, 2,000,000, and 8,640,000 points.

Table 1

Average similarity of the sampled and full dataset graphs.

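Average similarity (%) of nX subsets for sampling:

| Method | 1X | 2X | 3X | 4X | 5X | 6X | 8X | 10X | 20X | 30X | 40X | 50X |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pixel contrast method | 96.58 | 99.54 | 98.94 | 99.75 | 99.39 | 99.84 | 99.89 | 99.91 | 99.94 | 99.95 | 99.95 | 99.96 |
| Curve area method | 96.02 | 98.93 | 98.16 | 99.17 | 98.78 | 99.47 | 99.58 | 99.64 | 99.77 | 99.81 | 99.83 | 99.89 |
| Envelope line method | 97.14 | 99.25 | 98.70 | 99.42 | 99.14 | 99.63 | 99.70 | 99.74 | 99.79 | 99.86 | 99.88 | 99.92 |
| Average | 96.58 | 99.24 | 98.60 | 99.44 | 99.11 | 99.65 | 99.72 | 99.76 | 99.83 | 99.87 | 99.89 | 99.93 |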

Note: The graphing window for the similarity test is 908 pixels wide and 200 pixels high. Similarity is lower when the full dataset is divided into 1X, 3X, and 5X subsets; such items are marked in light gray in the original table.

Table 2

Sampling graphing speed test result.

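| Number of channels | Full dataset capacity (MB) | Full dataset graphing time (s) | Single-thread: sampling time (s) | Single-thread: graphing time (s) | Single-thread: total time (s) | Single-thread: raising rate | threadPool: sampling time (s) | threadPool: graphing time (s) | threadPool: total time (s) | threadPool: raising rate |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | 99 | 46 | 0.55 | 0.18 | 0.73 | 63.0 | 0.14 | 0.20 | 0.34 | 135.3 |
| 36 | 1186 | 546 | 6.43 | 1.34 | 7.77 | 70.3 | 1.06 | 1.25 | 2.31 | 236.4 |
| 72 | 2372 | 1072 | 13.04 | 2.69 | 15.73 | 68.2 | 1.98 | 2.51 | 4.49 | 238.8 |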

Note: Graphing uses threadPool technology, with one thread per channel. Total time = sampling time + graphing time. Raising rate = full dataset graphing time / total time.
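
As a quick arithmetic check of the two definitions in this note, the short Python sketch below recomputes the total time and raising rate from the timings reported in Table 2. The function name `raising_rate` and the layout of the `rows` list are illustrative choices made here; only the numbers come from the table.

```python
def raising_rate(full_graphing_s: float, sampling_s: float, graphing_s: float) -> float:
    """Raising rate = full dataset graphing time / (sampling time + graphing time)."""
    total_s = sampling_s + graphing_s
    return full_graphing_s / total_s


# Timings in seconds, copied from Table 2:
# (channels, full dataset graphing time,
#  single-thread sampling, single-thread graphing,
#  threadPool sampling, threadPool graphing)
rows = [
    (3, 46, 0.55, 0.18, 0.14, 0.20),
    (36, 546, 6.43, 1.34, 1.06, 1.25),
    (72, 1072, 13.04, 2.69, 1.98, 2.51),
]

if __name__ == "__main__":
    for channels, full_s, st_samp, st_graph, tp_samp, tp_graph in rows:
        single = raising_rate(full_s, st_samp, st_graph)
        pooled = raising_rate(full_s, tp_samp, tp_graph)
        print(f"{channels} channels: single-thread {single:.1f}, threadPool {pooled:.1f}")
```

For the 3-channel row, for example, the single-thread total is 0.55 + 0.18 = 0.73 s, so the raising rate is 46 / 0.73, or about 63.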

Figure 6

Concept of gradual scaling control appropriate for sampled graphs.

Language: English
Submitted on: Jul 16, 2019 | Accepted on: Oct 31, 2019 | Published on: Nov 13, 2019
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2019 Jianjun Wang, Yingang Zhao, Jun Chen, Suqing Zhang, Xudong Zhao, Yufei He, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.