
NOT ALL DATA IS TIME-SERIES


 Recall from layer 4 that the simulation of LRA activities over the years, using LAONA data, Parts I and II (PDF), and
data from external media sources, turned out to be time-series in nature. That is, the signals are generated in
successive steps.

 But not all data are time-series in nature. When one data point (or set) is generated independently of the others,
the time order is not important.

 Knowing whether the data under scrutiny are time-series or non-time-series is important because the methods
employed for visualizing and/or processing a set of data, however voluminous, depend on the nature of the
data.

 The good news is that elements of a dataset that is not time-series in nature can be understood a little more easily!
Well, not quite easily. With time-series data, care must be taken to account for the (immediately) prior patterns
of data before reasonable sense can be made of the (chronologically) succeeding sub-patterns. In contrast,
data(sets) generated independently of each other can be partitioned and processed in parallel!
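
 The contrast can be sketched in a few lines of Python. This is only an illustration under assumptions: `score_item`
(a word count) and `running_pattern` (a simple exponential smoothing) are hypothetical stand-ins, not methods taken
from these slides. The point is that shuffling independent items leaves the set of results unchanged, while reordering
a time series changes the result.

```python
import random

def score_item(text):
    # Hypothetical per-item score: it depends only on the item itself.
    return len(text.split())

def running_pattern(signals, alpha=0.5):
    # Order-dependent processing: every step is blended with the prior state,
    # so earlier signals influence how later ones are interpreted.
    state = 0.0
    history = []
    for value in signals:
        state = alpha * value + (1 - alpha) * state
        history.append(state)
    return history

essays = ["a short essay", "another somewhat longer essay here", "third"]
signals = [1.0, 4.0, 2.0, 8.0]

shuffled_essays = essays[:]
random.Random(0).shuffle(shuffled_essays)

# Independent items: the same scores come out, whatever the order.
independent_ok = sorted(map(score_item, essays)) == sorted(map(score_item, shuffled_essays))

# Time series: reversing the order changes the computed pattern.
order_matters = running_pattern(signals) != running_pattern(signals[::-1])

print(independent_ok, order_matters)
```

 Because `score_item` never looks at neighbouring items, the essays could be handed to different workers in any
order. `running_pattern` cannot be split that way without carrying the state across the split.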

 An example of such a dataset is a collection of essays on a specific topic submitted independently by entrants in a
writing contest. Unlike our LAONA data points (where a data point corresponds to an email message, as we
saw in the first few slides of this presentation), entries in the essay-writing contest (where each entry can be
regarded as a data point too) can be visualized and understood (that is, judged) independently. Each essay is
generally about two to ten times the size of a regular email from the LAONA dataset.

Illustration of partitioning a dataset into smaller datasets, so that the partitions are processed in parallel.
Parallel processing reduces processing time, but requires more processing nodes!

The partitioned datasets are then processed in parallel by different processing nodes, and the results from the
different processors are combined in whatever manner is desired. In the case of judging essay entries, the
processing nodes are the members of the team of judges. If there were 3000 essay entries and 10 judges, the
team could distribute the 3000 pieces evenly among the judges, so that each judge reads through and
assigns scores to 3000/10 = 300 essays.
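
The judging scenario can be sketched as a partition, process and combine pipeline. This is a minimal sketch under
assumptions: `partition` and `judge` are hypothetical helpers, the essays are generated placeholders, the word-count
"score" is only a stand-in for a real judgment, and a thread pool plays the role of the 10 judges.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(items, n_parts):
    # Split items into n_parts contiguous chunks of nearly equal size.
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def judge(chunk):
    # Hypothetical judge: scores each essay on its own (placeholder: word count).
    return [(essay_id, len(text.split())) for essay_id, text in chunk]

# 3000 placeholder essays, each identified by an integer id.
essays = [(i, f"essay number {i} " + "word " * (i % 7)) for i in range(3000)]

# Partition: 10 judges, so 3000 / 10 = 300 essays per judge.
chunks = partition(essays, 10)

# Process: each chunk is scored by a separate worker.
with ThreadPoolExecutor(max_workers=10) as pool:
    partial_results = list(pool.map(judge, chunks))

# Combine: merge the per-judge results into one score table.
scores = dict(pair for part in partial_results for pair in part)

print(len(chunks), len(chunks[0]), len(scores))
```

The combine step here is a plain merge into one dictionary. Any other way of combining (ranking, averaging across
judges) would fit in the same place, since each partial result was produced independently.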

However, let’s look at another scenario where the 10 judges were to pass judgment on just a single entry, but this time
a new 500-page fiction novel. It would not be practical to assign the judges, however experienced they are, to
