BUILDING A TOOL TO BLOCK ACCESS TO BLACK WEB PAGES (IMAGES, CONTENT)
Abstract
Web filtering is used to prevent access to black web pages (web pages with undesirable content or images). In this paper, we apply classification with support vector machines (SVM) to build a web filtering tool that integrates two filters: a text filter, based on text classification, and an image filter, based on image classification. With these two filters, the tool can block user access to web pages with undesirable content or remove undesirable images when a web page is displayed in the browser.
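The way the two filters might be combined can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `Page`, `FilterResult`, and `filter_page` are hypothetical, the order of the checks (text first, then images) is an assumption, and the two toy classifiers are simple stand-ins for trained SVM models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical representation of a fetched web page."""
    text: str
    images: List[str]  # image identifiers, e.g. URLs


@dataclass
class FilterResult:
    blocked: bool      # True if the whole page is refused
    images: List[str]  # images left on the page after filtering


def filter_page(
    page: Page,
    text_is_black: Callable[[str], bool],
    image_is_black: Callable[[str], bool],
) -> FilterResult:
    """Assumed glue logic: block on text, otherwise strip flagged images."""
    if text_is_black(page.text):
        return FilterResult(blocked=True, images=[])
    kept = [img for img in page.images if not image_is_black(img)]
    return FilterResult(blocked=False, images=kept)


# Stand-in classifiers for illustration only. In a real tool, each would
# wrap the prediction function of a trained SVM (text or image model).
def toy_text_classifier(text: str) -> bool:
    return "banned" in text.lower()


def toy_image_classifier(image_id: str) -> bool:
    return image_id.endswith("_flagged.jpg")


if __name__ == "__main__":
    page = Page("harmless news", ["a.jpg", "b_flagged.jpg"])
    print(filter_page(page, toy_text_classifier, toy_image_classifier))
```

Passing the classifiers in as plain callables keeps the decision logic independent of how each SVM model is trained or which features it uses.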