Creation of datasets from open sources

Detailed bibliography
Published in: 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), pp. 295-297
Main authors: Chugunkov, Ilya V.; Kabak, Dmitry V.; Vyunnikov, Viktor N.; Aslanov, Roman E.
Format: Conference paper
Language: English
Published: IEEE, 01.01.2018
ISBN: 9781538643396, 1538643391
Description
Summary: Machine learning is one of the fastest-growing fields in IT, but it still has some fundamental problems. Before a neural network can be trained, a large dataset of labeled entries must be collected. However, collecting this information manually takes a lot of time and resources. That is why one of the hardest problems in deep learning is obtaining the right data with the proper labels. This paper presents methods for automatically creating or updating a labeled dataset for building a car model classifier, using a parser of known Internet sources together with a simple classifier that removes incorrect data. The main goal of the article is to prove that public sources can be used to collect correctly selected and labeled data.
DOI: 10.1109/EIConRus.2018.8317091