Search results - Web Data Extraction and Crawling Techniques

  1.

    A New Framework for Domain-Specific Hidden Web Crawling Based on Data Extraction Techniques
    Authors: El-Desouky, A.I.; Ali, H.A.; El-Ghamrawy, S.M.

    ISBN: 0780397703, 9780780397705
    ISSN: 2329-6364
    Publication details: IEEE, 1 December 2006
    “…% of the content on the Web, this portion of Web called Hidden Web (HW), they are "Hidden" in databases behind search interfaces…”
    Conference paper
  2.

    Scrimmo: A Real-Time Web Scraper Monitoring the Belgian Real Estate Market
    Authors: Barzin, Felix; Yernaux, Gonzague; Vanhoof, Wim

    Publication details: IEEE, 26 October 2023
    “…Web scraping (or Web crawling), a technique for automated data extraction from websites, has emerged as a valuable tool for scientific research and data analysis…”
    Conference paper
  3.

    Enabling maps/location searches on mobile devices: constructing a POI database via focused crawling and information extraction
    Authors: Chuang, Hsiu-Min; Chang, Chia-Hui; Kao, Ting-Yao; Cheng, Chung-Ting; Huang, Ya-Yun; Cheong, Kuo-Pin

    ISSN: 1365-8816, 1362-3087, 1365-8824
    Publication details: Abingdon: Taylor & Francis, 2 July 2016
    “… However, manual annotation is costly and limited in current POI search services. With the abundance of information on the Web, many store POIs can be extracted from the Web…”
    Journal Article
  4.

    Swarm-intelligence-based extraction and manifold crawling along the Large-Scale Structure
    Authors: Awad, Petra; Peletier, Reynier; Canducci, Marco; Smith, Rory; Taghribi, Abolfazl; Mohammadi, Mohammad; Shin, Jihye; Tiňo, Peter; Bunte, Kerstin

    ISSN: 0035-8711, 1365-2966
    Publication details: London: Oxford University Press, 1 April 2023
    “…) on N-body cosmological simulation data of the Cosmic Web. The 1-DREAM toolbox consists of five Machine Learning methods, whose aim is the extraction and modelling…”
    Journal Article
  5.

    Web Scraping or Web Crawling: State of Art, Techniques, Approaches and Application
    Author: Khder, Moaiad

    ISSN: 2710-1274, 2074-8523
    Publication details: 30 December 2021
    “…Web scraping or web crawling refers to the procedure of automatic extraction of data from websites using software…”
    Journal Article
  6.

    Developing an automated framework for eco-label information categorization using web crawling and Natural Language Processing techniques
    Authors: Nguyen, Ho Anh Thu; Pham, Duy Hoang; Kim, Byeol; Ahn, Yonghan; Kwon, Nahyun

    ISSN: 0957-4174
    Publication details: Elsevier Ltd, 5 July 2025
    Published in: Expert systems with applications (5 July 2025)
    “… This study explores the application of web crawling techniques, Natural Language Processing (NLP…”
    Journal Article
  7.

    Keyword weight optimization using gradient strategies in event focused web crawling
    Authors: Rajiv, S; Navaneethan, C

    ISSN: 0167-8655, 1872-7344
    Publication details: Amsterdam: Elsevier B.V., 1 February 2021
    Published in: Pattern recognition letters (1 February 2021)
    “…•A web crawling system for obtaining the set of web data regarding key events is essential…”
    Journal Article
  8.

    An ontology-driven multimedia focused crawler based on linked open data and deep learning techniques
    Authors: Capuano, Andrea; Rinaldi, Antonio M.; Russo, Cristiano

    ISSN: 1380-7501, 1573-7721
    Publication details: New York: Springer US, 1 March 2020
    Published in: Multimedia tools and applications (1 March 2020)
    “… In this article we propose a novel approach to focused crawling based on the use of both textual and multimedia web page content…”
    Journal Article
  9.

    Web crawling based context aware recommender system using optimized deep recurrent neural network
    Authors: Boppana, Venugopal; Sandhya, P.

    ISSN: 2196-1115
    Publication details: Cham: Springer International Publishing, 20 November 2021
    Published in: Journal of big data (20 November 2021)
    “… Majorly, content and collaborative filtering techniques are employed in typical recommendation systems to find user preferences and provide final recommendations…”
    Journal Article
  10.

    Deep Web crawling: a survey
    Authors: Hernández, Inma; Rivero, Carlos R.; Ruiz, David

    ISSN: 1386-145X, 1573-1413
    Publication details: New York: Springer US, 1 July 2019
    Published in: World wide web (Bussum) (1 July 2019)
    “…Deep Web crawling refers to the problem of traversing the collection of pages in a deep Web site, which are dynamically generated in response to a particular query that is submitted using a search form…”
    Journal Article
  11.

    Collecting data on textiles from the internet using web crawling and web scraping tools
    Authors: Muehlethaler, Cyril; Albert, René

    ISSN: 0379-0738, 1872-6283
    Publication details: Ireland: Elsevier B.V., 1 May 2021
    Published in: Forensic science international (1 May 2021)
    “… It has become more affordable for researchers who can now devote most of their time to extracting meaningful information from the structured data…”
    Journal Article
  12.

    Bot crawler to retrieve data from Facebook based on the selection of posts and the extraction of user profiles
    Authors: Sánchez Paipilla, Ariel Guillermo; Durán Vaca, Mónica Katherine; Ballesteros Ricaurte, Javier Antonio; González Amarillo, Angela María; López, Pedro Nel

    ISSN: 0122-6517, 2382-4700
    Publication details: 20 September 2022
    Published in: Inge Cuc (20 September 2022)
    “…— The main objective of this work is to development of a Bot Crawler, which allows extracting information from Facebook without access restrictions, or request for credentials, based on web crawling and scraping techniques…”
    Journal Article
  13.

    Piecing together the puzzle: Improving event content coverage for real-time sub-event detection using adaptive microblog crawling
    Authors: Tokarchuk, Laurissa; Wang, Xinyue; Poslad, Stefan

    ISSN: 1932-6203
    Publication details: United States: Public Library of Science, 6 November 2017
    Published in: PloS one (6 November 2017)
    “… Existing Twitter event monitoring systems for sub-event detection and summarization currently typically analyse events based on partial data as conventional data collection methodologies are unable…”
    Journal Article
  14.

    Aplikasi Deteksi Motif dan Crawling Produk Batik Banyuwangi Berbasis Web
    Authors: Hakim, Lutfi; Novitasari, Nurul Hidayati; Kristanto, Sepyan Purnama; Yusuf, Dianni

    ISSN: 2301-7988, 2581-0588
    Publication details: LPPM ISB Atma Luhur, 29 December 2022
    Published in: Jurnal Sisfokom (29 December 2022)
    “… Therefore, in this study, an application was developed that can detect via web devices and also extract information related to Batik Banyuwangi products in the marketplace…”
    Journal Article
  15.

    xCrawl: a high-recall crawling method for Web mining
    Authors: Shchekotykhin, Kostyantyn; Jannach, Dietmar; Friedrich, Gerhard

    ISSN: 0219-1377, 0219-3116
    Publication details: London: Springer-Verlag, 1 November 2010
    Published in: Knowledge and information systems (1 November 2010)
    “…Web mining systems exploit the redundancy of data published on the Web to automatically extract information from existing Web documents…”
    Journal Article
  16.

    Towards extracting event-centric collections from Web archives
    Authors: Gossen, Gerhard; Risse, Thomas; Demidova, Elena

    ISSN: 1432-5012, 1432-1300
    Publication details: Berlin/Heidelberg: Springer Berlin Heidelberg, 1 March 2020
    “…Web archives constitute an increasingly important source of information for computer scientists, humanities researchers and journalists interested in studying past events…”
    Journal Article
  17.

    NLP-based techniques for Cyber Threat Intelligence
    Authors: Arazzi, Marco; R. Arikkat, Dincy; Nicolazzo, Serena; Nocera, Antonino; Rehiman K.A., Rafidha; P., Vinod; Conti, Mauro

    ISSN: 1574-0137
    Publication details: Elsevier Inc, 1 November 2025
    Published in: Computer science review (1 November 2025)
    “…In the digital era, threat actors employ sophisticated techniques for which, often, digital traces in the form of textual data are available…”
    Journal Article
  18.

    A Textual Content Analysis Model for Aligning Job Market Demands and University Curricula through Data Mining Techniques
    Authors: Januzaj, Ylber A.; Sylqa, Driton; Luma, Artan; Gashi, Luan

    ISSN: 1865-7923
    Publication details: 2 August 2024
    “… Specifically, the integration of data mining techniques is employed for the automated extraction of relevant information from both labor market demands and university curricula…”
    Journal Article
  19.

    SiSOB data extraction and codification: A tool to analyze scientific careers
    Authors: Geuna, Aldo; Kataishi, Rodrigo; Toselli, Manuel; Guzmán, Eduardo; Lawson, Cornelia; Fernandez-Zubieta, Ana; Barros, Beatriz

    ISSN: 0048-7333, 1873-7625
    Publication details: Amsterdam: Elsevier B.V., 1 November 2015
    Published in: Research policy (1 November 2015)
    “…•The software provides data crawling and data mining techniques used to transform webpage-based information and CV information into a relational database…”
    Journal Article
  20.

    Predicting customer profitability during acquisition: Finding the optimal combination of data source and data mining technique
    Authors: D’Haen, Jeroen; Van den Poel, Dirk; Thorleuchter, Dirk

    ISSN: 0957-4174, 1873-6793
    Publication details: Amsterdam: Elsevier Ltd, 1 May 2013
    Published in: Expert systems with applications (1 May 2013)
    “… ► Commercially-available data is augmented by web data. ► Combining both web data and commercial data leads to the best predictive results for lead qualification…”
    Journal Article