How to Fetch Web Page Content in ASP.NET

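  This article shows, by example, how to fetch the content of a web page in ASP.NET. It is shared here for reference. The two approaches are as follows: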

  1. Fetching web page content with HttpWebRequest in ASP.NET

  

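The code is as follows:

  // Requires: using System.IO; using System.Net; using System.Text;

  /// <summary>
  /// Method 1 (recommended): fetch a page's source with HttpWebRequest.
  /// Works well for pages that carry a BOM: StreamReader detects the
  /// encoding from the byte order mark. Pages without a BOM are decoded
  /// with Encoding.Default.
  /// </summary>
  /// <param name="url">Address of the page</param>
  /// <returns>The page source</returns>
  public static string GetHtmlSource2(string url)
  {
      HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
      request.Accept = "*/*"; // accept any content type
      request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)"; // identify as a browser
      request.AllowAutoRedirect = true; // follow redirects such as 302
      //request.CookieContainer = new CookieContainer(); // cookie container, if cookies are needed
      request.Referer = url; // send the page's own URL as the referrer

      // The using blocks release the response, stream, and reader even if reading throws
      using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
      using (Stream stream = response.GetResponseStream())
      using (StreamReader reader = new StreamReader(stream, Encoding.Default))
      {
          return reader.ReadToEnd();
      }
  }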
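  Method 1 decodes the page with Encoding.Default unless a BOM is present. When the server declares a charset in its Content-Type header, that name could be used to choose the encoding instead. The following is only a minimal sketch and is not part of the original code: CharsetHelper and ResolveEncoding are hypothetical names, and the choice of fallback encoding is an assumption left to the caller.

```csharp
using System;
using System.Text;

public static class CharsetHelper
{
    // Hypothetical helper (not from the original article): maps a charset
    // name such as "utf-8" to an Encoding, returning the fallback when the
    // name is missing or not recognized by the runtime.
    public static Encoding ResolveEncoding(string charset, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charset))
        {
            return fallback;
        }

        try
        {
            // Some servers quote the value, e.g. charset="utf-8".
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // Encoding.GetEncoding throws ArgumentException for unknown names.
            return fallback;
        }
    }
}
```

  One possible use is to pass CharsetHelper.ResolveEncoding(response.CharacterSet, Encoding.Default) as the second argument of the StreamReader constructor in GetHtmlSource2, so that a declared charset takes precedence over the system default.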

  2. Fetching web page content with WebResponse in ASP.NET

  

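The code is as follows:

  // Requires: using System; using System.IO; using System.Net; using System.Text;

  /// <summary>
  /// Method 2: fetch a page with WebRequest/WebResponse and decode it as UTF-8.
  /// Returns null if the request fails.
  /// </summary>
  public static string GetHttpData2(string Url)
  {
      string sException = null; // error text, captured for logging; this sample does not surface it
      string sRslt = null;      // stays null if the request fails
      WebRequest oWebRqst = WebRequest.Create(Url);
      oWebRqst.Timeout = 50000; // milliseconds (50 seconds)
      try
      {
          // The using blocks close the response and reader even if reading throws
          using (WebResponse oWebRps = oWebRqst.GetResponse())
          using (StreamReader oStreamRd = new StreamReader(oWebRps.GetResponseStream(), Encoding.UTF8))
          {
              sRslt = oStreamRd.ReadToEnd();
          }
      }
      catch (WebException e)
      {
          // Timeout, connection failure, or HTTP error status
          sException = e.Message;
      }
      catch (Exception e)
      {
          sException = e.ToString();
      }
      return sRslt;
  }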
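  Method 2 catches WebException but only keeps its message. The Status property of the exception tells the caller more about what went wrong. The following is a rough sketch under that assumption, not the original author's code: FetchErrors and Describe are hypothetical names, and the wording of the returned strings is arbitrary.

```csharp
using System.Net;

public static class FetchErrors
{
    // Hypothetical helper (not from the original article): turns a
    // WebException into a short description based on its Status.
    public static string Describe(WebException e)
    {
        switch (e.Status)
        {
            case WebExceptionStatus.Timeout:
                return "timed out";
            case WebExceptionStatus.NameResolutionFailure:
                return "host name could not be resolved";
            case WebExceptionStatus.ProtocolError:
            {
                // The server answered, but with an error status such as 404 or 500.
                HttpWebResponse response = e.Response as HttpWebResponse;
                return response != null
                    ? "HTTP error " + (int)response.StatusCode
                    : "HTTP protocol error";
            }
            default:
                // Any other status is reported by its enum name.
                return e.Status.ToString();
        }
    }
}
```

  Inside the catch (WebException e) block of GetHttpData2, the result of FetchErrors.Describe(e) could be logged or stored in sException in place of the raw message.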

  I hope this article is helpful for your C# programming.