HTML
| Published in: | Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining, pp. 15-40 |
|---|---|
| Main authors: | , , , |
| Format: | Book chapter |
| Language: | English |
| Published: | Chichester, UK: John Wiley & Sons, Ltd, 28 July 2014 |
| Subjects: | |
| ISBN: | 111883481X, 9781118834817 |
| Online access: | Full text |
| Summary: | This chapter introduces the fundamentals of HyperText Markup Language (HTML) from the perspective of a web data collector. It shows how to use browsers to display the source code of webpages and inspect specific HTML elements. The chapter develops the logic of markup languages in general and the syntax of HTML as a specific instance of a markup language, and it presents the most important HTML vocabulary. It also considers parsing, the process of reconstructing the structure and semantics of HTML documents, and how it helps to retrieve information from web documents. Start tags and end tags are also known as opening and closing tags. Tags are always enclosed by < and > to distinguish them from the content. Reserved characters are used for control purposes in a language. The chapter focuses on a subset of tags that are of special interest in the context of web data collection. |
|---|---|
| DOI: | 10.1002/9781118834732.ch2 |

