Parsing Web HTML Pages with jsoup: Basic Usage


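I spent the last two days struggling to pull data out of web pages, and looked into two ways of getting it: JSON and jsoup.

Today I finally have something to show for it. I've learned the basics of jsoup, so I'm posting my notes here in the hope that they help others who are learning Android. I'm a beginner myself and still working at it. Enough talk, here is the code. (Note: my grasp of Android terminology is not very thorough, so corrections are welcome.)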

package com.android.web;

import java.io.BufferedInputStream;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.http.util.ByteArrayBuffer;
import org.apache.http.util.EncodingUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.os.Bundle;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.ListView;
import android.widget.SimpleAdapter;

public class _GetWebResoureActivity extends Activity {

    Document doc;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        findViewById(R.id.button1).setOnClickListener(new OnClickListener() {
            @Override
            public void onClick(View v) {
                // Do the network I/O off the UI thread; on Android 3.0 and later,
                // networking on the main thread throws NetworkOnMainThreadException.
                new Thread(new Runnable() {
                    @Override
                    public void run() {
                        load();
                    }
                }).start();
            }
        });
    }

    protected void load() {
        try {
            // URL of the page to parse; 5000 is the timeout in milliseconds (not a delay)
            doc = Jsoup.parse(new URL("http://www.pkushutong.com"), 5000);
        } catch (MalformedURLException e1) {
            e1.printStackTrace();
            return; // without a document there is nothing to display
        } catch (IOException e1) {
            e1.printStackTrace();
            return;
        }

        // Each list row is stored as a map of key/value pairs
        final List<Map<String, String>> list = new ArrayList<Map<String, String>>();

        // getElementsByClass(className) returns every element carrying that CSS class
        Elements es = doc.getElementsByClass("home-box-class");

        // Walk through the matched elements
        for (Element e : es) {
            Map<String, String> map = new HashMap<String, String>();
            // title: the text inside the <p> tags of this element
            map.put("title", e.getElementsByTag("p").text());
            // href: the link target of the first <a> tag, prefixed with the site root
            map.put("href", "http://www.pkushutong.com"
                    + e.getElementsByTag("a").attr("href"));
            list.add(map);
        }

        // Views may only be touched from the UI thread
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                ListView listView = (ListView) findViewById(R.id.listView1);
                listView.setAdapter(new SimpleAdapter(_GetWebResoureActivity.this, list,
                        android.R.layout.simple_list_item_2,
                        new String[] { "title", "href" },
                        new int[] { android.R.id.text1, android.R.id.text2 }));
            }
        });
    }

    /**
     * Downloads a page as raw bytes and decodes it as GBK.
     * Not called by load(); kept as an alternative to letting jsoup fetch the page.
     *
     * @param urlString address of the page to download
     * @return the page source, or an empty string if anything goes wrong
     */
    public String getHtmlString(String urlString) {
        BufferedInputStream bis = null;
        try {
            URLConnection ucon = new URL(urlString).openConnection();
            bis = new BufferedInputStream(ucon.getInputStream());
            ByteArrayBuffer baf = new ByteArrayBuffer(500);
            int current;
            while ((current = bis.read()) != -1) {
                baf.append((byte) current);
            }
            return EncodingUtils.getString(baf.toByteArray(), "gbk");
        } catch (Exception e) {
            return "";
        } finally {
            if (bis != null) {
                try {
                    bis.close();
                } catch (IOException ignored) {
                    // nothing useful to do if closing fails
                }
            }
        }
    }
}

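The code is simple: it calls jsoup's lookup methods (getElementsByClass, getElementsByTag) to find tags, then reads the text or attributes of the tags it finds.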
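
To try the same lookups without a device or a network connection, here is a minimal sketch that parses an in-memory HTML string on a desktop JVM. The sample markup, the class name JsoupSelectSketch and the extract helper are all my own assumptions for illustration; the real page's structure may differ. It also uses absUrl("href") instead of string concatenation, which is one possible way to resolve relative links once a base URI has been passed to Jsoup.parse. It needs the jsoup jar on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupSelectSketch {

    // Hypothetical sample markup, made up for this sketch.
    static final String SAMPLE_HTML =
            "<div class=\"home-box-class\"><p>First title</p><a href=\"/a.html\">more</a></div>"
          + "<div class=\"home-box-class\"><p>Second title</p><a href=\"/b.html\">more</a></div>";

    // Hypothetical helper: returns one title/href map per matching element.
    static List<Map<String, String>> extract(String html, String baseUri) {
        // Passing a base URI lets jsoup turn relative links into absolute ones.
        Document doc = Jsoup.parse(html, baseUri);
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        for (Element box : doc.getElementsByClass("home-box-class")) {
            Map<String, String> row = new HashMap<String, String>();
            // Combined text of all <p> tags inside this element
            row.put("title", box.getElementsByTag("p").text());
            // First <a> tag, if any; absUrl resolves href against the base URI
            Element link = box.getElementsByTag("a").first();
            row.put("href", link == null ? "" : link.absUrl("href"));
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        for (Map<String, String> row : extract(SAMPLE_HTML, "http://www.pkushutong.com")) {
            System.out.println(row.get("title") + " -> " + row.get("href"));
        }
    }
}
```

Compared with prefixing the site root by hand, absUrl also copes with links that are already absolute, at the cost of returning an empty string when a link cannot be resolved.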

Note: one thing to stress is that jsoup-1.6.1.jar has to be added to the project; the program will not run without it.

Source code download: http://download.csdn.net/detail/u013415353/8389865
