Cao Rui
Science and Technology on Complex Electronic System Simulation Laboratory, Academy of Equipment, Beijing, 101416, China
Wang Rui
Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, S1 4DT, UK
Hao Li-Yun
Science and Technology on Complex Electronic System Simulation Laboratory, Academy of Equipment, Beijing, 101416, China
Wu Ling-Da
Science and Technology on Complex Electronic System Simulation Laboratory, Academy of Equipment, Beijing, 101416, China
ABSTRACT
To enable Web data to be used in a non-Internet environment and to overcome the time-sensitivity of Web data, this study proposes a standardization organization method for domain Web data. A domain Web standardization organization framework is established and the acquired data are divided into three categories: structured, unstructured and semi-structured. For semi-structured data with an implicit schema, a rational code design is used to transform the semi-structured data into structured data. By combining a file system with a relational database, a standardization organization method is established for the three types of data. Experimental results show that the method is effective and efficient.
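As an illustration only, the storage split described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the table names `fields` and `files`, the functions `flatten`, `store` and `demo`, and the path-based flattening of semi-structured data are all hypothetical stand-ins. In particular, the path flattening is a simple substitute for the paper's code design, which the abstract does not specify.

```python
import sqlite3
import tempfile
from pathlib import Path


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested dict/list.

    Hypothetical stand-in for the paper's code design: each leaf of the
    semi-structured item becomes one flat, relational row.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{prefix}/{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, f"{prefix}/{index}")
    else:
        yield prefix, str(node)


def store(conn, root, item_id, kind, payload):
    """Route one acquired item to the database, the file system, or both."""
    if kind == "structured":
        # payload: flat dict of field name -> value, stored directly as rows
        rows = [(item_id, name, str(value)) for name, value in payload.items()]
        conn.executemany("INSERT INTO fields VALUES (?, ?, ?)", rows)
    elif kind == "semi-structured":
        # payload: nested dict/list, flattened into rows before storage
        rows = [(item_id, path, value) for path, value in flatten(payload)]
        conn.executemany("INSERT INTO fields VALUES (?, ?, ?)", rows)
    elif kind == "unstructured":
        # payload: raw bytes, kept on disk with only the location indexed
        target = Path(root) / f"{item_id}.bin"
        target.write_bytes(payload)
        conn.execute("INSERT INTO files VALUES (?, ?)", (item_id, str(target)))
    else:
        raise ValueError(f"unknown data category: {kind}")


def demo():
    """Store one item of each category and return what the database holds."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fields (item_id TEXT, path TEXT, value TEXT)")
    conn.execute("CREATE TABLE files (item_id TEXT, location TEXT)")
    with tempfile.TemporaryDirectory() as root:
        store(conn, root, "a1", "structured", {"title": "Example", "year": 2013})
        store(conn, root, "b2", "semi-structured", {"page": {"tags": ["web", "data"]}})
        store(conn, root, "c3", "unstructured", b"raw page bytes")
        rows = conn.execute(
            "SELECT item_id, path, value FROM fields ORDER BY item_id, path"
        ).fetchall()
        files = conn.execute("SELECT item_id FROM files").fetchall()
    conn.close()
    return rows, files
```

In this sketch the relational database holds everything that can be queried by field, while the file system holds only opaque content, so an offline consumer needs just the database file and the directory of stored files.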
How to cite this article
Cao Rui, Wang Rui, Hao Li-Yun and Wu Ling-Da, 2013. A Domain Web Data Standardization Organization Method. Information Technology Journal, 12: 6710-6716.
DOI: 10.3923/itj.2013.6710.6716
URL: https://scialert.net/abstract/?doi=itj.2013.6710.6716