Creation of datasets from open sources
| Published in: | 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus) pp. 295 - 297 |
|---|---|
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.01.2018 |
| ISBN: | 9781538643396, 1538643391 |
| Summary: | Machine learning is one of the fastest-growing fields in IT, but it still faces some fundamental problems. Before a neural network can be trained, a large dataset of labeled entries must be collected, and collecting that information manually takes a lot of time and resources. This is why obtaining the right data with the proper labels is one of the hardest problems in deep learning. This paper examines methods for automatically creating or updating a labeled dataset for a car model classifier, using a parser of known Internet sources that applies a simple classifier to delete incorrect data. The main goal of the article is to show that public sources can be used to collect correctly selected and labeled data. |
|---|---|
| DOI: | 10.1109/EIConRus.2018.8317091 |
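
The filtering step mentioned in the summary (a simple classifier that deletes incorrect data collected by the parser) might look roughly like the sketch below. This record does not describe the paper's implementation, so everything here is an assumption for illustration: the `ScrapedEntry` structure, the `filter_entries` function, the `validity_score` callback standing in for the simple classifier, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ScrapedEntry:
    """One record collected by the parser (hypothetical structure)."""
    image_id: str      # identifier of the downloaded image
    source_label: str  # car model label taken from the source page


def filter_entries(
    entries: Iterable[ScrapedEntry],
    validity_score: Callable[[ScrapedEntry], float],
    threshold: float = 0.5,
) -> List[ScrapedEntry]:
    """Keep only entries whose validity score is at least `threshold`.

    `validity_score` stands in for the simple classifier; it is assumed
    to return a value in [0, 1], where higher means "more likely correct".
    """
    return [e for e in entries if validity_score(e) >= threshold]


# Toy stand-in for the simple classifier: fixed scores per image.
toy_scores = {"img_001": 0.93, "img_002": 0.12, "img_003": 0.50}

scraped = [
    ScrapedEntry("img_001", "model_a"),
    ScrapedEntry("img_002", "model_a"),
    ScrapedEntry("img_003", "model_b"),
]

cleaned = filter_entries(scraped, lambda e: toy_scores[e.image_id])
print([e.image_id for e in cleaned])
```

With the toy scores above, `img_002` falls below the threshold and is dropped, while the other two entries keep the labels they were scraped with. In a real pipeline the lambda would be replaced by a call to a trained model.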

