How to Quickly Find Information in an XML File

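  In the network era, XML files are used to store and transmit data. The SOAP protocol exchanges messages as XML, databases import and export data as XML files, and so on. So how can we quickly retrieve the information we need from an XML file?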

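  Both Java's JAXP and Microsoft .NET ship XML parsers. The .NET XmlReader parses the document as it reads it, forward only; the JAXP DOM parser loads the whole document into memory before it can be processed, and JAXP also offers an event-driven (SAX) parser. With any of these models, locating a particular node means writing the traversal code yourself, which is poorly suited to fast lookup. For this reason, both Microsoft .NET and JAXP provide XPath support for quickly locating the required nodes in an XML file.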

  Take, for example, an XML file named booksort.xml:

  <?xml version="1.0"?>
  <!-- a fragment of a book store inventory database -->
  <bookstore xmlns:bk="urn:samples">
    <book genre="novel" publicationdate="1997" bk:ISBN="1-861001-57-8">
      <title>Pride And Prejudice</title>
      <author>
        <first-name>Jane</first-name>
        <last-name>Austen</last-name>
      </author>
      <price>24.95</price>
    </book>
    <book genre="novel" publicationdate="1992" bk:ISBN="1-861002-30-1">
      <title>The Handmaid's Tale</title>
      <author>
        <first-name>Margaret</first-name>
        <last-name>Atwood</last-name>
      </author>
      <price>29.95</price>
    </book>
    <book genre="novel" publicationdate="1991" bk:ISBN="1-861001-57-6">
      <title>Emma</title>
      <author>
        <first-name>Jane</first-name>
        <last-name>Austen</last-name>
      </author>
      <price>19.95</price>
    </book>
    <book genre="novel" publicationdate="1982" bk:ISBN="1-861001-45-3">
      <title>Sense and Sensibility</title>
      <author>
        <first-name>Jane</first-name>
        <last-name>Austen</last-name>
      </author>
      <price>19.95</price>
    </book>
  </bookstore>

  To quickly find the titles of all books whose "last-name" is "Austen", we can use the following program:

  XmlReaderSample.cs

  // Sample for the System.Xml.XPath.XPathDocument class
  // Author: Any
  using System;
  using System.Xml;
  using System.Xml.XPath;

  public class XmlReaderSample
  {
      public static void Main()
      {
          // XPathDocument reads the whole file into a read-only store in its constructor,
          // so the reader can be closed immediately afterwards.
          XmlTextReader myxtreader = new XmlTextReader("booksort.xml");
          XPathDocument doc = new XPathDocument(myxtreader);
          myxtreader.Close();

          XPathNavigator nav = doc.CreateNavigator();

          // Select every <book> whose author/last-name equals "Austen".
          XPathExpression expr = nav.Compile("descendant::book[author/last-name='Austen']");

          // Optional: uncomment to return the books sorted by title.
          //expr.AddSort("title", XmlSortOrder.Ascending, XmlCaseOrder.None, "", XmlDataType.Text);

          XPathNodeIterator iterator = nav.Select(expr);
          while (iterator.MoveNext())
          {
              // Clone before moving: repositioning iterator.Current itself
              // would disturb the iterator's own state.
              XPathNavigator nav2 = iterator.Current.Clone();

              // <title> is the first child of each <book>.
              nav2.MoveToFirstChild();
              Console.WriteLine("Book title: {0}", nav2.Value);
          }
      }
  }

  Running this program produces:

  Book title: Pride And Prejudice

  Book title: Emma

  Book title: Sense and Sensibility

  As you can see, the query returns the correct results.
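  The article notes that JAXP offers XPath as well but shows only the .NET side. As a rough sketch of the same lookup in Java, the following uses the standard javax.xml.xpath API. The class name AustenTitles, the method findAustenTitles, and the inline SAMPLE_XML string (an abbreviated copy of booksort.xml with three of the four books and no attributes) are assumptions made for illustration, not part of the original sample.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AustenTitles {
    // Assumption: abbreviated inline copy of booksort.xml so the sketch
    // runs without a file on disk.
    static final String SAMPLE_XML =
          "<bookstore>"
        + "<book><title>Pride And Prejudice</title>"
        + "<author><first-name>Jane</first-name><last-name>Austen</last-name></author></book>"
        + "<book><title>The Handmaid's Tale</title>"
        + "<author><first-name>Margaret</first-name><last-name>Atwood</last-name></author></book>"
        + "<book><title>Emma</title>"
        + "<author><first-name>Jane</first-name><last-name>Austen</last-name></author></book>"
        + "</bookstore>";

    // Returns the titles of all books whose author/last-name is "Austen".
    static List<String> findAustenTitles(String xml) {
        try {
            // JAXP DOM parsing: the whole document is loaded into memory first.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();

            // Unlike the C# sample, this expression selects <title> directly
            // instead of selecting <book> and then moving to its first child.
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//book[author/last-name='Austen']/title",
                    doc, XPathConstants.NODESET);

            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    public static void main(String[] args) {
        for (String title : findAustenTitles(SAMPLE_XML)) {
            System.out.println("Book title: " + title);
        }
    }
}
```

  Selecting the title element in the XPath expression itself avoids relying on title being the first child of book, which the MoveToFirstChild call in the C# version assumes.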

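  The XPath support can also perform simple sorting (in .NET, through XPathExpression.AddSort, as in the commented-out line above) and simple calculations (through XPath functions such as sum()). For example, the aggregation that databases commonly perform on their data can be done with XPath.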

  For example:

  order.xml

  <!--Represents a customer order-->
  <order>
    <book ISBN='10-861003-324'>
      <title>The Handmaid's Tale</title>
      <price>19.95</price>
    </book>
    <cd ISBN='2-3631-4'>
      <title>Americana</title>
      <price>16.95</price>
    </cd>
  </order>

  and books.xml:

  <?xml version="1.0"?>
  <!-- This file represents a fragment of a book store inventory database -->
  <bookstore>
    <book cc="dd" xmlns:bk="urn:sample" xmlns:ns="http://www.Any.com" genre="autobiography" publicationdate="1981" ISBN="1-861003-11-0">
      <title>The Autobiography of Benjamin Franklin</title>
      <ns:author>
        <first-name>Benjamin</first-name>
        <last-name>Franklin</last-name>
      </ns:author>
      <price>8.99</price>
    </book>
    <book genre="novel" publicationdate="1967" ISBN="0-201-63361-2">
      <title>The Confidence Man</title>
      <author>
        <first-name>Herman</first-name>
        <last-name>Melville</last-name>
      </author>
      <price>11.99</price>
    </book>
    <book genre="philosophy" publicationdate="1991" ISBN="1-861001-57-6">
      <title>The Gorgias</title>
      <author>
        <name>Plato</name>
      </author>
      <price>9.99</price>
    </book>
  </bookstore>

  We can sum the price values in each of these XML files to get the total price.

  Evaluate.cs

  // Sample for the System.Xml.XPath.XPathNavigator class
  // Author: Any
  using System;
  using System.Xml;
  using System.Xml.XPath;

  public class EvaluateSample
  {
      public static void Main()
      {
          EvaluateSample myEvaluateSample = new EvaluateSample();
          myEvaluateSample.test("books.xml");
      }

      public void test(String args)
      {
          try
          {
              // Evaluate(String): sum every book price in books.xml.
              XPathDocument myXPathDocument = new XPathDocument(args);
              XPathNavigator myXPathNavigator = myXPathDocument.CreateNavigator();
              Console.WriteLine(myXPathNavigator.Evaluate("sum(descendant::book/price)"));

              // Evaluate(XPathExpression): sum every price in order.xml.
              XmlDocument doc = new XmlDocument();
              doc.Load("order.xml");
              XPathNavigator nav = doc.CreateNavigator();
              XPathExpression expr = nav.Compile("sum(//price/text())");
              Console.WriteLine(nav.Evaluate(expr));

              // Evaluate(XPathExpression, XPathNodeIterator): the iterator supplies the
              // context node. Because //price is an absolute path, the total is the same.
              XPathNodeIterator myXPathNodeIterator = nav.Select("descendant::book/title");
              expr = nav.Compile("sum(//price/text())");
              Console.WriteLine(nav.Evaluate(expr, myXPathNodeIterator));
          }
          catch (Exception e)
          {
              Console.WriteLine("Exception: {0}", e.ToString());
          }
      }
  }

  Running this program produces:

  30.97

  36.9

  36.9

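  As we can see, 30.97 is the sum of all price values in books.xml, and 36.9 is the sum of all price values in order.xml. XPath can be used not only to find information quickly but also to perform some basic processing on it.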
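  The same kind of aggregation can be sketched on the JAXP side. The example below is a minimal illustration, assuming an inline copy of order.xml; the class name OrderTotal and the helper evaluateNumber are hypothetical names introduced here. It asks the XPath engine for a numeric result (XPathConstants.NUMBER) rather than a node set.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class OrderTotal {
    // Assumption: inline copy of order.xml so the sketch needs no file on disk.
    static final String ORDER_XML =
          "<order>"
        + "<book ISBN='10-861003-324'><title>The Handmaid's Tale</title>"
        + "<price>19.95</price></book>"
        + "<cd ISBN='2-3631-4'><title>Americana</title>"
        + "<price>16.95</price></cd>"
        + "</order>";

    // Evaluates an XPath expression that yields a number, such as
    // sum(...) or count(...), against the given XML text.
    static double evaluateNumber(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // XPathConstants.NUMBER makes evaluate() return a java.lang.Double.
            return (Double) xpath.evaluate(expression, doc, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // count() and sum() are core XPath 1.0 functions.
        System.out.printf("Items: %.0f%n", evaluateNumber(ORDER_XML, "count(//price)"));
        System.out.printf("Total: %.2f%n", evaluateNumber(ORDER_XML, "sum(//price)"));
    }
}
```

  XPath numbers are double-precision floating point, so the sketch formats the total to two decimals instead of printing the raw value; for real monetary totals a decimal type in the host language is usually the safer choice.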