Google SiteMap Generator Tools (PHP & ASP)

Date: 2007-02-14 19:08:43   Source: 技术无忧
Keywords: Google SiteMap

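A while ago I was studying Google's SiteMap and built one for my blog. I read a lot of articles along the way, but many of the technical ones start something and never finish it, and they always contain some errors. Others are just copied and pasted over and over, with no way of telling where the original came from, full of holes and without any detailed explanation, which was really frustrating. Fortunately a few helpful people pointed me in the right direction, and I finally got a complete version working. I have folded in some related information I found and am posting it here for anyone who needs it.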
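About Google SiteMap:
  Google's Sitemap complements the existing robots.txt: where robots.txt tells crawlers what to stay away from, a Sitemap uses an XML format to list the site's pages for Google to read, so that the search engine can index the site's content faster and more completely. A Sitemap works a bit like an RSS feed for the whole site, with Google as the subscriber: when the site changes, the updated Sitemap tells Google what is new. Indexing no longer depends only on the crawler discovering pages by itself (pull); the site now actively announces its pages (push), and Google's hard-working crawlers can finally catch their breath.
  For now, though, Google Sitemap is not something every webmaster can use easily. The required XML format is not very complicated, but writing it by hand still takes a lot of effort. Google does provide an automatic Sitemap generator, but at present it exists only as a Python version, which few people are able to use. My own experience bears this out: I stared at it for days without making sense of it. However, Google Sitemap is released under a Creative Commons license and the Sitemap generator is open source, so related tools appeared quickly. I have found tools online that generate Sitemaps automatically for L-BLOG (for that one, see the post on this blog about Google SiteMap code that generates the XML directly, ASP version for LBLOG), O-BLOG, and DVBBS. With a few simple modifications, webmasters and bloggers can easily use this service.
  Put simply, you submit a site map to Google in XML format, and from then on Google periodically crawls the pages listed in that map. If you have been complaining that Google indexes too few of your pages, this is worth a try.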
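  For reference, here is roughly what such a map looks like. This is a minimal sketch that uses the same elements and the 0.84 namespace written by the ASP script further down; the page URL and the date are placeholders made up for illustration. (The later sitemaps.org protocol uses the namespace http://www.sitemaps.org/schemas/sitemap/0.9 instead.)

```xml
<?xml version='1.0' encoding='UTF-8'?>
<urlset xmlns='http://www.google.com/schemas/sitemap/0.84'>
  <!-- one <url> entry per page; this URL and date are hypothetical -->
  <url>
    <loc>http://www.example.com/index.asp</loc>
    <lastmod>2007-02-14</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
```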
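  Below are the two versions (ASP and PHP) of the SiteMap.xml generator that I found. I currently use the ASP version below and have tested it; the SiteMap file it generated for my site 缘聚杭州 is at http://www.coosuo.com/SiteMap.xml. I do not know PHP well, so the PHP version has not been tested. Feel free to try it, and please bear with me if it contains errors!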

ASP version:

<%
Server.ScriptTimeout=50000
' sitemap_gen.asp
' A simple script to automatically produce sitemaps for a webserver, in the Google Sitemap Protocol (GSP)
' by Francesco Passantino
' www.iteam5.net/francesco/sitemap
' v0.2 released 5 june 2005 (Listing a directory tree recursively improvement)
'
' BSD 2.0 license,
' http://www.opensource.org/licenses/bsd-license.php
' Collected and compiled by: 重庆森林@im286.com
' Partially modified by:     独人向晚 QQ19433114


session("server")="http://www.coosuo.com"                ' your domain
vDir = "/"                                               ' directory to build the SiteMap for, relative to the site root


set objfso = CreateObject("Scripting.FileSystemObject")
root = Server.MapPath(vDir)

'response.ContentType = "text/xml"
'response.write "<?xml version='1.0' encoding='UTF-8'?>"
'response.write "<urlset xmlns='http://www.google.com/schemas/sitemap/0.84'>"

str = "<?xml version='1.0' encoding='UTF-8'?>" & vbcrlf
str = str & "<urlset xmlns='http://www.google.com/schemas/sitemap/0.84'>" & vbcrlf

Set objFolder = objFSO.GetFolder(root)
'response.write getfilelink(objFolder.Path,objFolder.dateLastModified)
Set colFiles = objFolder.Files
For Each objFile In colFiles
        'response.write getfilelink(objFile.Path,objfile.dateLastModified)
        str = str & getfilelink(objFile.Path,objfile.dateLastModified) & vbcrlf
Next
ShowSubFolders(objFolder)

'response.write "</urlset>"
str = str & "</urlset>" & vbcrlf
set objfso = nothing   ' release the FileSystemObject created above

Set objStream = Server.CreateObject("ADODB.Stream")
    With objStream
    '.Type = adTypeText
    '.Mode = adModeReadWrite
    .Open
    .Charset = "utf-8"
    .Position = objStream.Size
    .WriteText str
    .SaveToFile server.mappath("/sitemap.xml"),2 ' name of the generated XML file; 2 = overwrite if it already exists
    .Close
    End With

  Set objStream = Nothing
  If Not Err Then
    Response.Write("<script>alert('Sitemap generated successfully!');history.back();</script>")
    Response.End
  End If

Sub ShowSubFolders(objFolder)
        Set colFolders = objFolder.SubFolders
        For Each objSubFolder In colFolders
                if folderpermission(objSubFolder.Path) then
                        'response.write getfilelink(objSubFolder.Path,objSubFolder.dateLastModified)
                        str = str & getfilelink(objSubFolder.Path,objSubFolder.dateLastModified) & vbcrlf
                        Set colFiles = objSubFolder.Files
                        For Each objFile In colFiles
                                'response.write getfilelink(objFile.Path,objFile.dateLastModified)
                                str = str & getfilelink(objFile.Path,objFile.dateLastModified) & vbcrlf
                        Next
                        ShowSubFolders(objSubFolder)
                end if
        Next
End Sub


Function getfilelink(file,datafile)
        file=replace(file,root,"",1,-1,1)
        file=replace(file,"\","/")
        If FileExtensionIsBad(file) then Exit Function
        if month(datafile)<10 then filedatem="0"
        if day(datafile)<10 then filedated="0"
        filedate=year(datafile)&"-"&filedatem&month(datafile)&"-"&filedated&day(datafile)
        getfilelink = "<url><loc>"&server.htmlencode(session("server")&file)&"</loc><lastmod>"&filedate&"</lastmod><changefreq>daily</changefreq><priority>1.0</priority></url>"
        Response.Flush
End Function


Function Folderpermission(pathName)

        ' Directories to filter out (not listed in the SiteMap)
        PathExclusion=Array("\blog","\temp","\_vti_cnf","_vti_pvt","_vti_log","cgi-bin","\admin","\edu")
        Folderpermission =True
        for each PathExcluded in PathExclusion
                if instr(ucase(pathName),ucase(PathExcluded))>0 then
                        Folderpermission = False
                        exit for
                end if
        next
End Function


Function FileExtensionIsBad(sFileName)
        Dim sFileExtension, bFileExtensionIsValid, sFileExt
        'modify for your file extension (http://www.googleguide.com/file_type.html)
        Extensions = Array("asp","png","jpeg","zip","pdf","ps","html","htm","php","wk1","wk2","wk3","wk4","wk5","wki","wks","wku","lwp","mw","xls","ppt","doc","wks","wps","wdb","wri","rtf","ans","txt")
        ' List of allowed file extensions; files whose extension is not in this list are left out of the SiteMap

        if len(trim(sFileName)) = 0 then
                FileExtensionIsBad = true
                Exit Function
        end if

        sFileExtension = right(sFileName, len(sFileName) - instrrev(sFileName, "."))
        bFileExtensionIsValid = false        'assume extension is bad
        for each sFileExt in extensions
                if ucase(sFileExt) = ucase(sFileExtension) then
                        bFileExtensionIsValid = True
                        exit for
                end if
        next
        FileExtensionIsBad = not bFileExtensionIsValid
End Function
%>
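
The zero-padding of the month and day in getfilelink can also be folded into one small helper. A minimal sketch, assuming a hypothetical function name FormatLastMod that is not part of Francesco Passantino's script; it uses only the built-in Year, Month, Day and Right functions and produces the same YYYY-MM-DD string:

```vbscript
<%
' FormatLastMod is a hypothetical helper, not part of the original script.
' It returns a date as YYYY-MM-DD, the form used in <lastmod>.
Function FormatLastMod(d)
    ' Right("0" & n, 2) pads a one-digit month or day with a leading zero
    FormatLastMod = Year(d) & "-" & Right("0" & Month(d), 2) & "-" & Right("0" & Day(d), 2)
End Function

' Example: format 14 February 2007
Response.Write FormatLastMod(DateSerial(2007, 2, 14))   ' writes 2007-02-14
%>
```

Inside getfilelink, a single line such as filedate = FormatLastMod(datafile) could then take the place of the three date lines.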
