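I used Python to scrape a <div> from site A and put it on my own page, but the <div> turned into &lt;div, and the page shows the <div> source code instead of rendering it. Google says this is Unicode escaping. How can I get the scraped <div> to display normally on my page?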
|  |      1ciba1990 OP Newbie asking for help... waiting online. | 
|  |      2wkdhf233      2015-07-10 00:53:28 +08:00 I don't understand at all what you're saying. | 
|  |      3imlonghao      2015-07-10 00:55:06 +08:00 via Android No code no bb... | 
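|  |      4ciba1990 OP @wkdhf233  What I mean is: I scraped a block of <div> code from site A and put it on my own page. In my page's source the <> shows up as &lt;, and the page doesn't render properly. | 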
|  |      5Septembers      2015-07-10 00:57:12 +08:00  1 | 
|  |      6ciba1990 OP @imlonghao  <html> <head> </head> <body> <div class="searchResults" id="searchResults"> <h2>Web results</h2> <ul> <li> <h3><a href="https://www.python.org/" target="_blank">Welcome to Python.org</a></h3> <p class="url">https://www.python.org/<span class="date"> - 7 hours ago</span></p> <p>The official home of the Python Programming Language.</p> </li><li class="sameHostResult"> <h3><a href="https://www.python.org/downloads/" target="_blank">Download Python | Python.org</a></h3> <p class="url">https://www.python.org/downloads/</p> <p>... 2015-05-23 Download Release Notes <br> · Python 3.4.3 2015-02-25 Download ...</br></p> </li><li> <h3><a href="http://www.pyhton.org/" target="_blank">Wrong Page ?</a></h3> <p class="url">http://www.pyhton.org/</p> <p>If you were trying to reach Phyton website please copy and past the following <br> URL in your browser: http://www.phyton.org. YOU MAY HAVE GOTTEN HERE BY<br> ...</br></br></p> </li><li> <h3><a href="http://www.salome-platform.org/forum/forum_10/211874468" target="_blank">Creating geometry using <b>pyhton</b> code — SALOME Platform</a></h3> <p class="url">http://www.salome-platform.org/forum/forum_10/211874468</p> <p>Hello everyone!,. I'm almost new in salome; I build up a simple geometry (n <br> nodes and n-1 beams) using the salome gui. It took me a long time; then I <br> discovered ...</br></br></p> </li><li> <h3><a href="http://developers.gigya.com/display/GD/Pyhton+SDK+Change+Log" target="_blank"><b>Pyhton</b> SDK Change Log - Gigya Documentation - Developers Guide</a></h3> <p class="url">http://developers.gigya.com/display/GD/Pyhton+SDK+Change+Log</p> <p>Jun 10, 2015 <b>...</b> Version 2.17 - 26 Apr 2015. Bug fix regarding URL encoding. 
The Python SDK <br> now restores urllib handlers after completing requests to Gigya.</br></p> </li><li> <h3><a href="" target="_blank"><b>Pyhton</b> - You A Me LifeIine Full Promo Dancehall 2015 - YouTube</a></h3> <p class="url"></p> <p>Feb 16, 2015 <b>...</b> <b>Pyhton</b> - You A Me LifeIine ○Full Promo○ Dancehall 2015. IamDjChigga ... Up <br> Hot DJ Chigga <b>Pyhton</b> A Good Artists the Thing Loud...$$$$$.</br></p> </li><li class="sameHostResult"> <h3><a href="" target="_blank"><b>Pyhton</b> - Mommy Nah Worry No More Full Promo Dancehall 2015 <b>...</b></a></h3> <p class="url"></p> <p>Mar 20, 2015 <b>...</b> <b>Pyhton</b> - Mommy Nah Worry No More ○Full Promo○ Dancehall 2015. <br> IamDjChigga. SubscribeSubscribedUnsubscribe ...</br></p> </li><li> <h3><a href="https://www.thenewboston.com/forum/topic.php?id=6569" target="_blank"><b>Pyhton</b> GUI´s - thenewboston Forum</a></h3> <p class="url">https://www.thenewboston.com/forum/topic.php?id=6569</p> <p>May 2, 2015 <b>...</b> Can anyone recommend a good book( i.e. as in paper) to use as a reference <br> work with Python GUis. There are lots of excellent videos etc on ...</br></p> </li><li> <h3><a href="http://www.gamefaqs.com/psp/932978-metal-gear-solid-portable-ops/answers/189967-how-do-i-beat-pyhton" target="_blank">How do I beat <b>pyhton</b>? 
- Metal Gear Solid: Portable Ops Answers for <b>...</b></a></h3> <p class="url" title="http://www.gamefaqs.com/psp/932978-metal-gear-solid-portable-ops/answers/189967-how-do-i-beat-pyhton">http://www.gamefaqs.com/psp/932978-metal-gear-solid-portable-ops/answe...</p> <p>For Metal Gear Solid: Portable Ops on the PSP, a GameFAQs Answers question <br> titled "How do I beat <b>pyhton</b>?".</br></p> </li><li> <h3><a href="https://bugs.launchpad.net/bugs/1415067" target="_blank">Bug #1415067 “QtiPlot crashed when chossing <b>Pyhton</b> as default sc <b>...</b></a></h3> <p class="url">https://bugs.launchpad.net/bugs/1415067</p> <p>Jan 27, 2015 <b>...</b> I installed qtiplot and worked on it for a while. Changing the Default scripting <br> language to <b>Pyhton</b> in Preferences, I end with this problem.</br></p> </li> </ul> </div> </body> </html> | 
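|  |      7imlonghao      2015-07-10 00:58:00 +08:00 via Android Post your crawler code. | 
|  |      8wkdhf233      2015-07-10 01:01:35 +08:00 @ciba1990 It got escaped, so just replace it back; you don't even need regex.. By the way, this is the first time I've seen someone scrape the HTML tags along with the content. Use regex to cut out the key content and output the tags yourself, and there would be no problem at all. | 
|  |      9ciba1990 OP @wkdhf233  How do I use regex? html=urllib2.urlopen(url).read() soup = BeautifulSoup(html) link = soup.find_all('div') mydiv=str(link[0]) This is my crawler code; I'm just starting out. | 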
|  |      10ciba1990 OP @imlonghao  html=urllib2.urlopen(url).read() soup = BeautifulSoup(html) link = soup.find_all('div') mydiv=str(link[0]) | 
|  |      11imlonghao      2015-07-10 01:10:07 +08:00 via Android import HTMLParser html_parser = HTMLParser.HTMLParser() s = html_parser.unescape(s) | 
|  |      12imlonghao      2015-07-10 01:10:35 +08:00 via Android Put mydiv in where s is. | 
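For anyone on Python 3, the unescape step suggested above might look like the sketch below. It assumes Python 3.4 or later, where the standard library's `html.unescape` takes the place of `HTMLParser.HTMLParser().unescape`; `escaped_div` is a made-up stand-in for the `mydiv` string from the crawler, not actual output from it.

```python
import html

# Hypothetical stand-in for mydiv: a scraped fragment whose markup was entity-escaped.
escaped_div = '&lt;div class=&quot;searchResults&quot;&gt;Web results&lt;/div&gt;'

# html.unescape turns character references such as &lt; &gt; &quot; back into characters.
restored_div = html.unescape(escaped_div)
print(restored_div)
```

This only helps if the string itself contains the entities; if the template engine escapes it again on output, the page will still show source code.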
|  |      13ciba1990 OP | 
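|  |      14icedx      2015-07-10 01:18:30 +08:00 via Android The template is probably escaping it. | 
|  |      16lcqtdwj      2015-07-10 01:26:08 +08:00  1 {% autoescape off %} {{ keyword}} {% endautoescape %} Check the docs; the point is to turn off auto-escaping. | 
|      18sallowdish      2015-07-10 02:51:25 +08:00 To display the code, put it inside <pre></pre>; to display the content, turn off HTML escaping. | 
|  |      19imlonghao      2015-07-10 06:52:07 +08:00 via Android Disable template escaping in Django. | 
|  |      20loading      2015-07-10 08:01:44 +08:00 via Android Flask auto-escapes too, for security reasons. OP, at least tell us which libraries you are using! You didn't even post the basic code. Nobody wants to steal your code; everyone just wants to help. There is plenty of open-source crawler code out there. | 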
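To see why auto-escaping produces the symptom described in this thread, here is a minimal sketch using only the Python 3 standard library. The `render` function is a toy written for illustration, not Django's or Flask's API, and the `scraped` and `page` strings are invented; it only mimics what a template engine does with auto-escaping on versus off.

```python
import html

def render(template, value, autoescape=True):
    """Toy renderer (hypothetical, not a real framework API): fills {{ value }}."""
    if autoescape:
        # Mimics template auto-escaping: < > & and quotes become entities.
        value = html.escape(value)
    return template.replace("{{ value }}", value)

# Invented example data standing in for the scraped fragment and the page template.
scraped = '<div id="searchResults"><h2>Web results</h2></div>'
page = "<body>{{ value }}</body>"

escaped_page = render(page, scraped)                 # browser shows the markup as text
raw_page = render(page, scraped, autoescape=False)   # browser renders the <div>
print(escaped_page)
print(raw_page)
```

Since the escaping exists for security, as noted above, turning it off is probably only reasonable for content you trust.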
|  |      21thinkmore      2015-07-10 09:52:33 +08:00 Just handle the escaping of the scraped content; it can be done on either the front end or the back end. |