Extracting Databases from Dark Data with DeepDive
DeepDive is a system for extracting relational databases from : the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data - scientific papers, Web classified ads, customer service notes, and s...
Uložené v:
| Vydané v: | Proceedings - ACM-SIGMOD International Conference on Management of Data Ročník 2016; s. 847 |
|---|---|
| Hlavní autori: | , , , , |
| Médium: | Journal Article |
| Jazyk: | English |
| Vydavateľské údaje: |
United States
01.06.2016
|
| ISSN: | 0730-8078 |
| On-line prístup: | Zistit podrobnosti o prístupe |
| Tagy: |
Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
|
| Shrnutí: | DeepDive is a system for extracting relational databases from
: the mass of text, tables, and images that are widely collected and stored but which cannot be exploited by standard relational tools. If the information in dark data - scientific papers, Web classified ads, customer service notes, and so on - were instead in a relational database, it would give analysts a massive and valuable new set of "big data." DeepDive is distinctive when compared to previous information extraction systems in its ability to obtain very high precision and recall at reasonable engineering cost; in a number of applications, we have used DeepDive to create databases with accuracy that meets that of human annotators. To date we have successfully deployed DeepDive to create data-centric applications for insurance, materials science, genomics, paleontologists, law enforcement, and others. The data unlocked by DeepDive represents a massive opportunity for industry, government, and scientific researchers. DeepDive is enabled by an unusual design that combines large-scale probabilistic inference with a novel developer interaction cycle. This design is enabled by several core innovations around probabilistic training and inference. |
|---|---|
| Bibliografia: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
| ISSN: | 0730-8078 |
| DOI: | 10.1145/2882903.2904442 |