(Artificial Intelligence) Building a Search Engine Based on Lucene and Heritrix (7)
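... to crawl resources.
(2) In the word-segmentation module, the Chinese Academy of Sciences segmenter, JE, and the StandardAnalyzer analyzer are used to implement Chinese word segmentation, so that information related to web pages, film and television, and images can be indexed more effectively.
(3) The system makes full use of a self-designed framework that separates data from business logic, improving the system's extensibility and operability.
(4) A user-interface subsystem based on Ajax was developed, which completes search tasks effectively and quickly.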
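Items (2) and (3) can be pictured with a minimal Java sketch. It is an illustration under stated assumptions, not the system's actual code: forward maximum matching over a tiny dictionary stands in for the segmenters named above, and an in-memory map stands in for a Lucene index. The names `SegmentationSketch`, `Segmenter`, `ForwardMaxMatchSegmenter`, and `InvertedIndex` are hypothetical. The strings are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only: dictionary-based segmentation feeding a tiny inverted index. */
final class SegmentationSketch {

    /** Hypothetical stand-in for an analyzer: turns raw text into a list of terms. */
    interface Segmenter {
        List<String> segment(String text);
    }

    /** Forward maximum matching (an assumed technique, swapped in for the real analyzers). */
    static final class ForwardMaxMatchSegmenter implements Segmenter {
        private final Set<String> dictionary;
        private final int maxWordLength;

        ForwardMaxMatchSegmenter(Set<String> dictionary) {
            this.dictionary = dictionary;
            int longest = 1;
            for (String word : dictionary) {
                longest = Math.max(longest, word.length());
            }
            this.maxWordLength = longest;
        }

        @Override
        public List<String> segment(String text) {
            List<String> terms = new ArrayList<>();
            int pos = 0;
            while (pos < text.length()) {
                int end = Math.min(text.length(), pos + maxWordLength);
                // Shrink the candidate until it is a dictionary word or a single character.
                while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                    end--;
                }
                terms.add(text.substring(pos, end));
                pos = end;
            }
            return terms;
        }
    }

    /** Minimal in-memory inverted index: term -> sorted set of document ids. */
    static final class InvertedIndex {
        private final Segmenter segmenter; // the index depends only on the interface
        private final Map<String, Set<Integer>> postings = new HashMap<>();

        InvertedIndex(Segmenter segmenter) {
            this.segmenter = segmenter;
        }

        void add(int docId, String text) {
            for (String term : segmenter.segment(text)) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }

        Set<Integer> search(String term) {
            return postings.getOrDefault(term, Collections.emptySet());
        }
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList(
                "\u4e2d\u6587",                 // "Chinese"
                "\u641c\u7d22",                 // "search"
                "\u641c\u7d22\u5f15\u64ce"));   // "search engine"
        Segmenter segmenter = new ForwardMaxMatchSegmenter(dictionary);
        InvertedIndex index = new InvertedIndex(segmenter);
        index.add(1, "\u4e2d\u6587\u641c\u7d22\u5f15\u64ce"); // "Chinese search engine"
        index.add(2, "\u641c\u7d22\u4e2d\u6587");             // "search" followed by "Chinese"
        System.out.println(segmenter.segment("\u4e2d\u6587\u641c\u7d22\u5f15\u64ce"));
        System.out.println(index.search("\u4e2d\u6587"));
    }
}
```

Because `InvertedIndex` depends only on the `Segmenter` interface, a different segmenter could be substituted without touching the indexing code. That is one simple way to read the separation of data and business logic described in item (3).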
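Although the precision of the search results and the way the index is stored still need further refinement, the system basically realizes its design ideas and implementation approach, and it can search the crawled web pages effectively. Moreover, because both the design and the implementation follow an object-oriented approach, the system's inheritability and reusability will make future refinement and improvement easier.
Because time was tight, and because the system covers a wide scope and involves many technical details, some details were implemented with relatively simple methods so that the system as a whole could be completed smoothly. Further in-depth research is therefore needed to improve the overall performance of the system.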
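Acknowledgments
During my graduation project I gained a great deal, both in my studies and in daily life, and this would not have been possible without the care and help of my teachers and classmates.
First, I sincerely thank my supervisor, Zhao Jingying. From topic selection and development through to the completion of the thesis, my supervisor gave me careful guidance, which helped me make great progress in analyzing and developing search engines. My supervisor's warmth toward others, rigorous scholarship, and strong sense of responsibility are a model for me, and they will be a valuable asset for my studies and for my future work and life. I express my sincere gratitude here!
Second, I thank my classmates for looking after me in daily life and helping me in my studies. Each of them has strengths worth learning from. I especially thank Lu Xiaowei, Han Jia, Ma Yongfei, Li Nan, and others. Every happy moment spent with them will remain a fond memory for life, and I wish them all a wonderful life.
Finally, I thank my parents, who have always stood behind me, given me love and care, encouraged me when I met difficulties, and rejoiced in my achievements. They are the driving force of my life and my lifelong treasure. I wish them peace, health, and happiness!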