Scraping Dynamic Web Pages (Part 2)

Contents
  Preface
  I. Overview
  II. Basic Approach
  III. Writing the Code
    1. Importing the libraries
    2. Loading the page data
    3. Fetching and saving
    4. Saving the workbook
  Summary

Preface

The previous post covered how to scrape the data. This one covers how to organize the data and save it to an Excel workbook while it is being fetched.

Previous post: 《Python 爬虫之简单的爬虫（三）》 (A Simple Python Crawler, Part 3): https://blog.csdn.net/weixin_57061292/article/details/135073002

I. Overview

This post builds on the previous one, adding to and modifying its code.

Added: code that uses Python libraries for working with Excel documents.

Modified: section "3. 获取指定数据" (Fetching the specified data) of the previous post. While iterating over the fetched data, each item is now written into a newly created Excel workbook.

[Figure: screenshot of the result]

II. Basic Approach

Continuing from the steps in the previous post:

  Step 5: import the additional libraries that are needed.
  Step 6: replace the print calls in section 3 of the previous post with code that saves the data to the workbook.
  Step 7: delete the workbook's default "Sheet" worksheet and save the workbook.

III. Writing the Code

1. Importing the libraries

The code is as follows:

    # From the previous post
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    import time

    # Newly added
    from openpyxl.styles import Font, Alignment, Border, Side
    import openpyxl
    import re

2. Loading the page data

The code is as follows:

    # From the previous post
    driver = webdriver.Firefox()
    driver.get("https://movie.douban.com/annual/2022/?fullscreen=1&source=movie_navigation")
    time.sleep(5)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Newly added
    # Create the workbook instance
    wb = openpyxl.Workbook()

The only addition here is a workbook object, which is used to generate the Excel file.

3. Fetching and saving

The code is as follows:

    # One thin border on all four sides, reused for every cell below
    THIN_SIDE = Side(border_style="thin")
    THIN_BORDER = Border(left=THIN_SIDE, right=THIN_SIDE, top=THIN_SIDE, bottom=THIN_SIDE)

    # Get the titles of the four film/TV categories
    comment_Titles = driver.find_elements(by=By.CSS_SELECTOR, value=".module-top10-grid-chart-title")
    # Create one worksheet per category title
    i = 0
    for comment in comment_Titles:
        # Create the worksheet
        ws = wb.create_sheet(index=i, title=comment.text)
        # Freeze the first row
        ws.freeze_panes = "A2"
        # First row: centered, bold, bordered.
        # Column headers: title, cast, rating, region
        cell_titles = ["片名", "演员", "评分", "产地"]
        index = 1
        for title in cell_titles:
            wc = ws.cell(row=1, column=index, value=title)
            # Bold
            wc.font = Font(bold=True)
            # Border on all four sides
            wc.border = THIN_BORDER
            # Center horizontally and vertically
            wc.alignment = Alignment(horizontal="center", vertical="center")
            index += 1
        i += 1

    # Get the title of the first-place entry in each category
    which_mo_list = driver.find_elements(by=By.CSS_SELECTOR, value=".subject-top-title")
    # Write each first-place title into its worksheet
    a = 0
    for each_mo in which_mo_list:
        movie_title = each_mo.get_attribute("title")
        if a == 0:
            ws = wb["评分最高华语电影"]
            wc = ws.cell(column=1, row=2, value=f"《{movie_title}》")
            wc.border = THIN_BORDER
        elif a == 1:
            ws = wb["评分最高外语电影"]
            wc = ws.cell(column=1, row=2, value=f"《{movie_title}》")
            wc.border = THIN_BORDER
        elif a == 2:
            ws = wb["年度冷门佳片"]
            wc = ws.cell(column=1, row=2, value=f"《{movie_title}》")
            wc.border = THIN_BORDER
        elif a == 3:
            ws = wb["华语剧集"]
            wc = ws.cell(column=1, row=2, value=f"《{movie_title}》")
            wc.border = THIN_BORDER
        a += 1

    # Get the rating of the first-place entry in each category
    movies_top_scores_list = driver.find_elements(by=By.CSS_SELECTOR, value=".rating-card-value")
    # Write each first-place rating into its worksheet
    c = 0
    for movie_top_score in movies_top_scores_list:
        score = movie_top_score.text
        if c == 0:
            ws = wb["评分最高华语电影"]
            wc = ws.cell(column=3, row=2, value=score)
            wc.border = THIN_BORDER
        elif c == 1:
            ws = wb["评分最高外语电影"]
            wc = ws.cell(column=3, row=2, value=score)
            wc.border = THIN_BORDER
        elif c == 2:
            ws = wb["年度冷门佳片"]
            wc = ws.cell(column=3, row=2, value=score)
            wc.border = THIN_BORDER
        elif c == 3:
            ws = wb["华语剧集"]
            wc = ws.cell(column=3, row=2, value=score)
            wc.border = THIN_BORDER
        c += 1

    # Get the cast information for all entries
    persons_list = driver.find_elements(by=By.CSS_SELECTOR, value=".subject-credit")
    # Add the cast information to the matching worksheet
    b = 0
    for person in persons_list:
        person_title = person.find_elements(by=By.TAG_NAME, value="p")
        for title in person_title:
            # Cast information
            actor = title.text
            if 0 <= b <= 10:
                ws = wb["评分最高华语电影"]
                wc = ws.cell(column=2, row=b + 1, value=actor)
                wc.border = THIN_BORDER
            elif 11 <= b <= 21:
                ws = wb["评分最高外语电影"]
                wc = ws.cell(column=2, row=b - 10, value=actor)
                wc.border = THIN_BORDER
            elif 22 <= b <= 32:
                ws = wb["年度冷门佳片"]
                wc = ws.cell(column=2, row=b - 21, value=actor)
                wc.border = THIN_BORDER
            elif 33 <= b <= 43:
                ws = wb["华语剧集"]
                wc = ws.cell(column=2, row=b - 32, value=actor)
                wc.border = THIN_BORDER
            b += 1

    # Get the titles of all entries (except first place in each category)
    movies_title_list = driver.find_elements(by=By.CSS_SELECTOR, value=".subjects-rank-title")
    # Write the titles into the worksheets
    d = 0
    for movie_title in movies_title_list:
        # Extract the Chinese text with a regular expression.
        # [\u4e00-\u9fff]+ matches one or more consecutive Chinese characters;
        # group(1) returns the content of the first parenthesized group.
        # Fall back to the full text if no Chinese characters are found.
        match = re.search(r"([\u4e00-\u9fff]+)", movie_title.text)
        chinese_text = match.group(1) if match else movie_title.text
        if 0 <= d <= 8:
            ws = wb["评分最高华语电影"]
            wc = ws.cell(column=1, row=d + 3, value=f"《{chinese_text}》")
            wc.border = THIN_BORDER
        elif 9 <= d <= 17:
            ws = wb["评分最高外语电影"]
            wc = ws.cell(column=1, row=d - 6, value=f"《{chinese_text}》")
            wc.border = THIN_BORDER
        elif 18 <= d <= 26:
            ws = wb["年度冷门佳片"]
            wc = ws.cell(column=1, row=d - 15, value=f"《{chinese_text}》")
            wc.border = THIN_BORDER
        elif 27 <= d <= 35:
            ws = wb["华语剧集"]
            wc = ws.cell(column=1, row=d - 24, value=f"《{chinese_text}》")
            wc.border = THIN_BORDER
        d += 1

    # Get the region of each entry (except first place in each category)
    addresses_list = driver.find_elements(by=By.CSS_SELECTOR, value=".subjects-rank-credits div:nth-child(2)")
    # Add the region names to the worksheets
    e = 0
    for addresses in addresses_list:
        address_text = addresses.text
        if 0 <= e <= 8:
            ws = wb["评分最高华语电影"]
            wc = ws.cell(column=4, row=e + 3, value=address_text)
            wc.border = THIN_BORDER
        elif 9 <= e <= 17:
            ws = wb["评分最高外语电影"]
            wc = ws.cell(column=4, row=e - 6, value=address_text)
            wc.border = THIN_BORDER
        elif 18 <= e <= 26:
            ws = wb["年度冷门佳片"]
            wc = ws.cell(column=4, row=e - 15, value=address_text)
            wc.border = THIN_BORDER
        elif 27 <= e <= 35:
            ws = wb["华语剧集"]
            wc = ws.cell(column=4, row=e - 24, value=address_text)
            wc.border = THIN_BORDER
        e += 1

    # Get the rating of each entry (except first place in each category)
    movies_scores_list = driver.find_elements(by=By.CSS_SELECTOR, value=".subjects-rank-rating")
    # Write the ratings into the worksheets
    f = 0
    for movie_score in movies_scores_list:
        score = movie_score.text
        if 0 <= f <= 8:
            ws = wb["评分最高华语电影"]
            wc = ws.cell(column=3, row=f + 3, value=score)
            wc.border = THIN_BORDER
        elif 9 <= f <= 17:
            ws = wb["评分最高外语电影"]
            wc = ws.cell(column=3, row=f - 6, value=score)
            wc.border = THIN_BORDER
        elif 18 <= f <= 26:
            ws = wb["年度冷门佳片"]
            wc = ws.cell(column=3, row=f - 15, value=score)
            wc.border = THIN_BORDER
        elif 27 <= f <= 35:
            ws = wb["华语剧集"]
            wc = ws.cell(column=3, row=f - 24, value=score)
            wc.border = THIN_BORDER
        f += 1

That is a lot of code, but it all follows the same pattern. In the previous post, each query returned a list, which was then iterated over and printed.

Here the iteration saves the data instead of printing it. The elements in each list come in a regular order (work through it yourself to get a feel for it), so a few range checks are enough to distribute them across the four category worksheets. The rest is code that styles the cells.

4. Saving the workbook

The code is as follows:

    del wb["Sheet"]
    wb.save(f"example{int(time.time())}.xlsx")

This deletes the workbook's default "Sheet" worksheet, which serves no purpose here, and saves the workbook. By default the file is written to the current directory.

Summary

Most of this is straightforward. The main part is the conditional logic for iterating over the data and saving it, which you need to work through by hand to really understand. This post uses Python 3.11.6. Pay attention to the basic environment, because even identical code may behave differently in a different setup.
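As a footnote to the range checks in section 3, the mapping from a running counter to a worksheet and a row can also be written as a single calculation. The following is a minimal sketch, assuming every category contributes the same number of consecutive entries (9 for the ranked titles, 11 for the cast entries, as in the ranges used in the code). `locate` and `SHEET_NAMES` are hypothetical names introduced for illustration and are not part of the original code.

```python
# Worksheet names in page order, as used in the original code.
SHEET_NAMES = ["评分最高华语电影", "评分最高外语电影", "年度冷门佳片", "华语剧集"]


def locate(index, per_sheet, first_row):
    """Map a flat 0-based counter to (sheet_number, row).

    index     -- running counter over all scraped elements
    per_sheet -- assumed number of consecutive elements per category
    first_row -- worksheet row that receives the first element of a category
    """
    sheet_number, offset = divmod(index, per_sheet)
    return sheet_number, first_row + offset


# Ranked titles: 9 per category, starting at row 3.
sheet_number, row = locate(9, per_sheet=9, first_row=3)
print(SHEET_NAMES[sheet_number], row)
```

For the ranked titles this gives the same rows as the `d + 3`, `d - 6`, `d - 15` and `d - 24` branches, and for the cast entries (`per_sheet=11`, `first_row=1`) the same rows as `b + 1`, `b - 10`, `b - 21` and `b - 32`. If the page ever returns a different number of entries per category, only the `per_sheet` argument would need to change.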