Optimized Focused Web Crawler with Natural Language Processing Based Relevance Measure in Bioinformatics Web Sources

Open Access | Jun 2019

DOI: https://doi.org/10.2478/cait-2019-0021 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 146-158
Submitted on: Feb 22, 2018
Accepted on: Feb 14, 2019
Published on: Jun 18, 2019
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2019 S. R. Mani Sekhar, G. M. Siddesh, Sunilkumar S. Manvi, K. G. Srinivasa, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.