A vector method for finding sequences in big data
| Published in: | Сучасні інформаційні системи, Volume 6, Issue 3, pp. 13-22 |
|---|---|
| Main author: | |
| Format: | Journal Article |
| Language: | English |
| Published: | 14.09.2022 |
| ISSN: | 2522-9052 |
| Summary: | A software solution is proposed for metric search and identification of logical-temporal patterns in a business data flow, based on additional vector data structures and a parallel method for processing them. The subject of the research is methods for searching for and identifying logical-temporal patterns in big data. The purpose of the study is to increase the efficiency of searching for and recognizing logical-temporal patterns that semantically form business functionality in an 8-hour frame of screenshots containing "garbage" data. Applied methods: set theory and Boolean algebra, metric models for determining the parameters of sets of binary vectors, elements of probability theory, the theory of algorithms, and software modeling. Results obtained: a method for searching for and recognizing patterns, based on a vector problem of character sequences that identify patterns in big data streams using unitary coding of information primitives and data; vector models, that is, unitary-encoded data structures that describe a big data flow as Cartesian products of a set of primitive string markers and a discrete sequence of implementation of a given time frame. Practical significance: the implementation of the vector method made it possible to create a program that recognizes patterns in a big data stream with a probability of 0.77%. |
| DOI: | 10.20998/2522-9052.2022.3.02 |