How to scrape meteorological bureau data with Python

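There are many ways to scrape meteorological data. This article uses real-time weather data from the China Meteorological Administration as an example of how to write a scraper in Python. Before you start, make sure Python is installed along with the requests and BeautifulSoup libraries (pip install requests beautifulsoup4).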


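1. Analyze the target website

First, visit China Weather (http://www.weather.com.cn/), the public weather portal of the China Meteorological Administration, and find the URL of the page that carries the real-time weather data. In this example we scrape the weather data for Beijing.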
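The Beijing URL used in this article ends in the city code 101010100. As a minimal sketch, and assuming other cities follow the same URL pattern (the article only shows Beijing), the page URL can be built from a city code. The names build_forecast_url, URL_TEMPLATE, and CITY_CODES are hypothetical and exist only for illustration.

```python
# Pattern taken from the single Beijing URL used in this article;
# that other cities follow it is an assumption.
URL_TEMPLATE = "http://www.weather.com.cn/weather/{code}.shtml"

# Only the Beijing code appears in the article. Add other codes after
# confirming them on the site.
CITY_CODES = {"beijing": "101010100"}


def build_forecast_url(city):
    """Return the weather page URL for a city name found in CITY_CODES."""
    code = CITY_CODES[city.lower()]
    return URL_TEMPLATE.format(code=code)


print(build_forecast_url("Beijing"))
```

Keeping the codes in one dictionary means a new city needs only one new entry once its code has been confirmed.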

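2. Send the request

Use the requests library to send a GET request and retrieve the page's HTML.

import requests

url = "http://www.weather.com.cn/weather/101010100.shtml"
# A timeout keeps the script from hanging if the server does not respond.
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
response.encoding = response.apparent_encoding  # avoid garbled Chinese text
html_content = response.text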
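A request can fail for reasons outside the script's control, such as a network error or an HTTP error status. The sketch below wraps the request in a small function that returns None on failure instead of crashing. The function name fetch_html and the User-Agent string are hypothetical, and whether the site requires a particular User-Agent is not something this article confirms.

```python
import requests

# An identifying User-Agent; the exact string is an arbitrary example.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; weather-demo/0.1)"}


def fetch_html(url, get=requests.get):
    """Return the page text, or None if the request fails.

    `get` is injectable so the function can be exercised without
    touching the network.
    """
    try:
        response = get(url, headers=HEADERS, timeout=10)
        response.raise_for_status()
    except requests.RequestException as exc:
        print("Request failed:", exc)
        return None
    # Let requests guess the charset from the body when the header omits it.
    response.encoding = response.apparent_encoding
    return response.text
```

Calling fetch_html(url) with the Beijing URL from this step returns the HTML text on success, so the caller only has to check for None.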

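3. Parse the HTML

Use the BeautifulSoup library to parse the HTML and extract the data we need. In this example, that means the temperature, humidity, wind direction, and wind speed.

from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, 'html.parser')
# find() returns None when no matching tag exists, so these lines raise
# AttributeError if the page does not use these class names.
temperature = soup.find('div', {'class': 'tem'}).find('span').text
humidity = soup.find('div', {'class': 'shidu'}).find('span').text
wind_direction = soup.find('div', {'class': 'fengxiang'}).find('li').text
wind_speed = soup.find('div', {'class': 'fengli'}).find('li').text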
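Chained find() calls fail with AttributeError as soon as one tag is missing. As a minimal sketch of a more defensive approach, the helper below returns None for a missing element. The function name extract_text is hypothetical, and SAMPLE_HTML is a made-up fragment that only mimics the class names used in this article; the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# A made-up HTML fragment that mimics the class names used in this
# article; it is not copied from the real site.
SAMPLE_HTML = """
<div class="tem"><span>25</span></div>
<div class="shidu"><span>40%</span></div>
"""


def extract_text(soup, css_class, child_tag):
    """Return the stripped text of <child_tag> inside <div class=css_class>, or None."""
    container = soup.find("div", class_=css_class)
    if container is None:
        return None
    child = container.find(child_tag)
    if child is None:
        return None
    return child.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(extract_text(soup, "tem", "span"))       # present in the sample
print(extract_text(soup, "fengxiang", "li"))   # absent in the sample, so None
```

Returning None lets the caller decide whether a missing field is an error or simply data that is unavailable.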

4. Print the results

Print the extracted data to the console.

print("Temperature:", temperature)
print("Humidity:", humidity)
print("Wind direction:", wind_direction)
print("Wind speed:", wind_speed)

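5. Complete code

Combine the steps above into a single Python script.

import requests
from bs4 import BeautifulSoup


def get_weather_data():
    """Fetch the Beijing weather page and return the four extracted fields."""
    url = "http://www.weather.com.cn/weather/101010100.shtml"
    # A timeout keeps the script from hanging if the server does not respond.
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop early on an HTTP error status
    response.encoding = response.apparent_encoding  # avoid garbled Chinese text
    soup = BeautifulSoup(response.text, 'html.parser')
    # find() returns None when no matching tag exists, so these lines raise
    # AttributeError if the page does not use these class names.
    temperature = soup.find('div', {'class': 'tem'}).find('span').text
    humidity = soup.find('div', {'class': 'shidu'}).find('span').text
    wind_direction = soup.find('div', {'class': 'fengxiang'}).find('li').text
    wind_speed = soup.find('div', {'class': 'fengli'}).find('li').text
    return temperature, humidity, wind_direction, wind_speed


if __name__ == "__main__":
    temperature, humidity, wind_direction, wind_speed = get_weather_data()
    print("Temperature:", temperature)
    print("Humidity:", humidity)
    print("Wind direction:", wind_direction)
    print("Wind speed:", wind_speed)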

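Run this script and it will print the real-time weather data for Beijing. Note that the example depends on the page structure at the time of writing; if the site's structure changes, the code will need to be adjusted accordingly. Crawling the site too frequently may also get your IP address banned, so use the scraper responsibly.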
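One simple way to avoid hammering the site is to pause between requests. The sketch below is a minimal illustration of that idea: the function name fetch_all is hypothetical, and the 5-second default delay is an arbitrary value, not a limit published by the site.

```python
import time


def fetch_all(urls, fetch, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between requests.

    `fetch` is any callable that takes a URL and returns its content.
    `sleep` is injectable so the pause can be skipped in tests.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request after the first
        results[url] = fetch(url)
    return results
```

Passing the fetching function in as an argument keeps the delay logic separate from the HTTP code, so the same loop works with any request function.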

Original article by 未希. If you repost it, please credit the source: https://www.kdun.com/ask/468693.html
