Creation of datasets from open sources
| Published in: | 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), pp. 295-297 |
|---|---|
| Main authors: | , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 01.01.2018 |
| ISBN: | 9781538643396, 1538643391 |
| Abstract: | Machine learning is one of the fastest-growing fields in IT, but it still has some fundamental problems. Before a neural network can be trained, a vast dataset of labeled entries must be collected. However, collecting this information manually takes a lot of time and resources. That is why one of the hardest problems in deep learning is obtaining the right data with the proper labels. This paper presents methods for automatically creating or updating a labeled dataset for building a car model classifier, using a parser of known Internet sources together with a simple classifier that removes incorrect data. The main goal of this article is to demonstrate that public sources can be used to collect correctly selected and labeled data. |
| DOI: | 10.1109/EIConRus.2018.8317091 |

