Q: How do I write Python code to get a tag's attribute information with BeautifulSoup?
<dd isStop = "1" class='isStop' matchcode="201409066001" matchnumcn ="周六001" starttime = "1409994000000" endtime ="1409993820000" isattention = "0" hostname="北九州" guestname="福冈黄蜂" leagueid = "533" hostteamid = "46148" visitteamid = "12193" matchid="1000817" leagueName="J2联赛" class="league_533"style="display: none;" ishot="0" >pass</dd>
For example, what I want to get is:
style="display: none;"
How do I get the none out of this attribute?
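Here is the code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup

tag_content = """<dd isStop = "1" class='isStop' matchcode="201409066001" matchnumcn ="周六001" starttime = "1409994000000" endtime ="1409993820000" isattention = "0" hostname="北九州" guestname="福冈黄蜂" leagueid = "533" hostteamid = "46148" visitteamid = "12193" matchid="1000817" leagueName="J2联赛" class="league_533" style="display: none;" ishot="0">pass</dd>"""

# Name the parser explicitly so the result does not depend on which parsers are installed.
tag_soup = BeautifulSoup(tag_content, "html.parser")
# Attribute access on a tag returns the raw attribute value: "display: none;"
style_str = tag_soup.dd["style"]
# Take the part after the colon, then drop surrounding spaces and the trailing semicolon.
print(style_str.split(":")[1].strip().rstrip(";"))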
Beautiful Soup cannot return "none" directly, but it easily gives us the attribute value display: none;, which is then simple to process with plain Python.
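As a rough illustration of that second step, the style string can be split into a small dictionary of CSS declarations. This is only a sketch: `parse_style` is a hypothetical helper name, the input string is assumed to be what `tag_soup.dd["style"]` returns, and values that themselves contain a semicolon (for example some `url(...)` values) are not handled.

```python
def parse_style(style):
    """Split an inline style string such as "display: none;" into a dict."""
    declarations = {}
    for part in style.split(";"):
        if ":" not in part:
            continue  # skip empty chunks, e.g. the one after a trailing semicolon
        name, _, value = part.partition(":")
        declarations[name.strip().lower()] = value.strip()
    return declarations


# Assumed to be the value obtained from tag_soup.dd["style"].
style_str = "display: none;"
print(parse_style(style_str)["display"])
```

Going through a dictionary rather than a fixed index keeps the lookup working when the style attribute holds more than one declaration.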
Use tag.attrs["style"] to get the attribute value, then apply a regular expression to it.
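A minimal sketch of the regex step, assuming the style string has already been read from the tag's attributes. The pattern and the name `display_value` are illustrative, not part of Beautiful Soup.

```python
import re

# "display" at the start of the string or right after a ";", then a colon,
# then the value captured up to the next ";" or the end of the string.
DISPLAY_RE = re.compile(r"(?:^|;)\s*display\s*:\s*([^;]+?)\s*(?:;|$)", re.IGNORECASE)


def display_value(style):
    """Return the value of the display declaration, or None if it is absent."""
    match = DISPLAY_RE.search(style)
    return match.group(1) if match else None


# Assumed to be the value of the tag's style attribute.
print(display_value("display: none;"))
```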
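1. It would be best if Python's cgi module had a dedicated method for getting the style or other attributes out of HTML. This style attribute has no id or name and is not a value either, so I am not sure whether it can be retrieved that way.
2. My very clumsy approach: wrap the whole block in ''' and save it, then in a separate .py file open it with open(), read it line by line with readlines(), use a regex to match style="display:", and finally cut the value out with a string slice.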
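On the first point, the standard library's html.parser module can report a tag's attributes without any third-party package. The sketch below assumes Python 3; `AttrCollector` is a hypothetical class name, and the tag is shortened from the one in the question to keep the example small.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collects the attributes of the first tag with a given name."""

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names are lowercased.
        if tag == self.tag_name and self.attrs is None:
            self.attrs = dict(attrs)


# Shortened version of the tag from the question (an assumption for brevity).
html = '<dd hostname="北九州" class="league_533" style="display: none;" ishot="0">pass</dd>'
collector = AttrCollector("dd")
collector.feed(html)
print(collector.attrs["style"])
print(collector.attrs["hostname"])
```

Because the parser hands over every attribute as a name and value pair, the style attribute needs no id or name of its own to be found.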
from pyquery import PyQuery

s = """ <dd isStop = "1" class='isStop' matchcode="201409066001" matchnumcn ="周六001" starttime = "1409994000000" endtime ="1409993820000" isattention = "0" hostname="北九州" guestname="福冈黄蜂" leagueid = "533" hostteamid = "46148" visitteamid = "12193" matchid="1000817" leagueName="J2联赛" class="league_533" style="display: none;" ishot="0" >pass</dd>"""

p = PyQuery(s)
a = p("dd")  # select the <dd> element
print(a.attr('style'))     # value of the style attribute
print(a.attr('hostname'))  # value of the hostname attribute
Output:
display: none;
北九州