Unlock the potential of the world's biggest database. This practical book shows you how to build portals, search engines and other knowledge-based applications that mine the information you need from the Web.
- Written by a developer for developers
- A practical, hands-on approach
- Illustrates how Java and associated tools (XML, HTML) can be combined with database technology to display and manipulate Web-derived information more effectively
- Demonstrates how to build a structure browser, a portal and a meta-search engine, and how to make 'Talking Pages'
Tony Loton, LOTONtech Ltd, Middlewich, UK
Tony Loton launched LOTONtech as a vehicle for researching and developing innovative software solutions. He developed the WebDataKit: a Java 2 solution comprising an API and a Structured Query Language designed specifically for the automatic extraction of HTML and XML from web sources. Tony's early Java web mining ideas were previously featured as a case study contribution to "Professional Java Data programming" (Wrox Press). This book takes the ideas much further, with brand new material.
Preface
About the Author
Acknowledgements
Surveying the Scene
Language of the Web
HTML and XML Parsing
Data Filters and Structured Queries
Building a Portal with Java
Building a Search Engine with Java
Mail Mining with Java
Introduction to Text Mining
Introduction to Data Mining
Loose Ends and Looking Ahead
Appendix A: Software Installation and Configuration
Appendix B: Javadoc Extracts
Appendix C: Earlier Versions of JAXP
Appendix D: License and Copyright Statements
Appendix E: Census 1891 Data XML
Appendix F: Share Price Cluster Data
Appendix G: Glossary of Acronyms
References
Further Reading
Index