Story detection using generalized concepts and relations
| Published in: | 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 942-949 |
|---|---|
| Main authors: | , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | ACM, 25 August 2015 |
| Summary: | A major challenge in automated text analysis is that different words are used for related concepts. Analyzing text at the surface level would treat related concepts (i.e. actors, actions, targets, and victims) as different objects, potentially missing common narrative patterns. Shallow parsers reveal semantic roles of words leading to subject-verb-object triplets. We developed a novel algorithm to extract information from triplets by clustering them into generalized concepts by utilizing syntactic criteria based on common contexts and semantic corpus-based statistical criteria based on "contextual synonyms". We show that generalized concepts representation of text (1) overcomes surface level differences (which arise when different keywords are used for related concepts) without drift, (2) leads to a higher-level semantic network representation of related stories, and (3) when used as features, they yield a significant 36% boost in performance for the story detection task. |
|---|---|
| DOI: | 10.1145/2808797.2809312 |
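
The summary describes clustering subject-verb-object triplets into generalized concepts based on common contexts. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes triplets are already extracted, treats a (verb, object) pair as a subject's "context", and merges subjects that share enough contexts. The function name `generalize_subjects`, the `min_shared` threshold, and the sample triplets are all hypothetical, and the corpus-based "contextual synonyms" criterion from the summary is not modeled here.

```python
from collections import defaultdict


def generalize_subjects(triplets, min_shared=2):
    """Group subjects that share at least `min_shared` (verb, object) contexts.

    Illustrative assumption: two subjects belong to the same generalized
    concept when they occur in enough identical verb-object contexts.
    """
    # Collect the set of (verb, object) contexts observed for each subject.
    contexts = defaultdict(set)
    for subj, verb, obj in triplets:
        contexts[subj].add((verb, obj))

    subjects = sorted(contexts)
    parent = {s: s for s in subjects}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Merge every pair of subjects with enough shared contexts.
    for i, a in enumerate(subjects):
        for b in subjects[i + 1:]:
            if len(contexts[a] & contexts[b]) >= min_shared:
                parent[find(b)] = find(a)

    # Gather the merged groups into sorted lists for stable output.
    clusters = defaultdict(set)
    for s in subjects:
        clusters[find(s)].add(s)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical sample triplets, invented for illustration.
triplets = [
    ("militants", "attacked", "convoy"),
    ("militants", "kidnapped", "workers"),
    ("insurgents", "attacked", "convoy"),
    ("insurgents", "kidnapped", "workers"),
    ("army", "defended", "village"),
]
clusters = generalize_subjects(triplets)
print(clusters)
```

In this toy input, "militants" and "insurgents" share two contexts and end up in one group, while "army" stays on its own. Requiring more than one shared context is one simple way to limit the drift the summary mentions, at the cost of leaving rarely seen words unmerged.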