

news 2025/12/20 8:58:55

Contents: Introduction; Univariate forecasting with irregular timestamps; Forecasting with exogenous variables and irregular timestamps

Introduction

When working with time series data, the frequency of the timestamps is a key factor that can significantly affect forecasting results. Regular frequencies such as daily, weekly, or monthly are easy to handle. Irregular frequencies such as business days, which exclude weekends, can be challenging for time series forecasting methods.

Our forecast method can handle this kind of irregular time series data as long as you specify the frequency of the series. For business days, for example, pass the frequency as 'B'. Without this argument the method may fail to detect the frequency automatically, especially when the timestamps are irregular.

    # Show the Colab badge for this tutorial.
    from nixtlats.utils import colab_badge
    colab_badge('docs/tutorials/8_irregular_timestamps')

    from fastcore.test import test_eq, test_fail, test_warns
    from dotenv import load_dotenv

    # Load environment variables from a .env file.
    load_dotenv()

Output:

    True

    # pandas for data handling, TimeGPT for forecasting.
    import pandas as pd
    from nixtlats import TimeGPT

    # Create the TimeGPT client and pass a token for authentication.
    # If no token is given, the TIMEGPT_TOKEN environment variable is used.
    timegpt = TimeGPT(token='my_token_provided_by_nixtla')

    # Create the client using the token from the environment.
    timegpt = TimeGPT()

Univariate forecasting with irregular timestamps

The first step is to get your time series data. The data must include timestamps and the associated values. For example, you might be working with stock prices, and your data might look like the following. In this example we use OpenBB.

    # Read the test dataset from the given URL.
    df_fed_test = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/openbb/fed.csv')

    # Check that two forecasts of the FF column are identical:
    # the first lets TimeGPT infer the frequency,
    # the second passes the weekly frequency 'W' explicitly.
    # Both request a 90% prediction interval.
    pd.testing.assert_frame_equal(
        timegpt.forecast(df_fed_test, h=12, target_col='FF', level=[90]),
        timegpt.forecast(df_fed_test, h=12, target_col='FF', freq='W', level=[90]),
    )

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    INFO:nixtlats.timegpt:Inferred freq: W-WED
    WARNING:nixtlats.timegpt:The specified horizon h exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
    INFO:nixtlats.timegpt:Restricting input...
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...
    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    INFO:nixtlats.timegpt:Inferred freq: W-WED
    WARNING:nixtlats.timegpt:The specified horizon h exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
    INFO:nixtlats.timegpt:Restricting input...
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...

    # Read the PLTR price history from the given URL.
    pltr_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/openbb/pltr.csv')

    # Parse the date column as datetimes.
    pltr_df['date'] = pd.to_datetime(pltr_df['date'])

    # Show the first rows.
    pltr_df.head()

Output:

             date   Open   High   Low  Close  Adj Close     Volume  Dividends  Stock Splits
    0  2020-09-30  10.00  11.41  9.11   9.50       9.50  338584400        0.0           0.0
    1  2020-10-01   9.69  10.10  9.23   9.46       9.46  124297600        0.0           0.0
    2  2020-10-02   9.06   9.28  8.94   9.20       9.20   55018300        0.0           0.0
    3  2020-10-05   9.43   9.49  8.92   9.03       9.03   36316900        0.0           0.0
    4  2020-10-06   9.04  10.18  8.90   9.90       9.90   90864000        0.0           0.0

Let's confirm that this dataset has irregular timestamps. The dayofweek attribute of a pandas datetime column returns the day of the week, with Monday = 0 and Sunday = 6. Checking dayofweek > 4 therefore tests whether a date falls on a Saturday (5) or a Sunday (6), which are normally non-trading days.

    # Count the dates that fall on a weekend.
    (pltr_df['date'].dt.dayofweek > 4).sum()

Output:

    0

No observation falls on a weekend, so the timestamps are irregular with respect to a daily calendar. Let's inspect the Close series.

    # Plot the closing price (Close) against the date.
    timegpt.plot(pltr_df, time_col='date', target_col='Close')

To forecast this data, use our forecast method. It is important to specify the frequency of the data with the freq argument. In this case it should be 'B', for business days. We also need to set time_col to select the time index of the series (the default is ds) and target_col to choose the variable to forecast, which here is Close.

    # test_fail asserts that the call raises an error whose message
    # contains 'frequency': without freq, the irregular timestamps
    # prevent the frequency from being inferred.
    # Arguments: df is the input DataFrame, h=14 is the horizon,
    # time_col='date' and target_col='Close' name the columns.
    test_fail(
        lambda: timegpt.forecast(
            df=pltr_df, h=14,
            time_col='date', target_col='Close',
        ),
        contains='frequency'
    )

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...

    # Forecast 14 business days of Close, passing freq='B' explicitly.
    fcst_pltr_df = timegpt.forecast(
        df=pltr_df, h=14, freq='B',
        time_col='date', target_col='Close',
    )

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    WARNING:nixtlats.timegpt:The specified horizon h exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...

    # Show the first rows of the forecast.
    fcst_pltr_df.head()

Output:

             date    TimeGPT
    0  2023-09-25  14.688427
    1  2023-09-26  14.742798
    2  2023-09-27  14.781240
    3  2023-09-28  14.824156
    4  2023-09-29  14.795214

Remember that the frequency for business days is 'B'. For other frequencies, see the pandas offset aliases documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases. Specifying the frequency tells the forecast method how the timestamps are spaced, which helps it pick up the patterns in the data and produce more accurate and reliable forecasts. Let's plot the forecast generated by TimeGPT.

    # Plot the history together with the forecast.
    # pltr_df holds the observed prices, fcst_pltr_df the forecast.
    # max_insample_length=90 shows only the last 90 historical observations.
    timegpt.plot(
        pltr_df,
        fcst_pltr_df,
        time_col='date',
        target_col='Close',
        max_insample_length=90,
    )

You can also add uncertainty quantification to your forecasts with the level argument.

    # Forecast 42 business days of Close.
    # add_history=True also returns fitted values for the historical period.
    # level=[40.66, 90] requests 40.66% and 90% prediction intervals.
    fcst_pltr_levels_df = timegpt.forecast(
        df=pltr_df, h=42, freq='B',
        time_col='date', target_col='Close',
        add_history=True,
        level=[40.66, 90],
    )

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    WARNING:nixtlats.timegpt:The specified horizon h exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...
    INFO:nixtlats.timegpt:Calling Historical Forecast Endpoint...

    # Plot the history and the forecast with its prediction intervals.
    # level must list the interval levels that were requested in forecast.
    timegpt.plot(
        pltr_df,
        fcst_pltr_levels_df,
        time_col='date',
        target_col='Close',
        level=[40.66, 90],
    )

If you want to forecast another variable, just change the target_col argument. Let's now forecast Volume:

    # Forecast 14 business days of Volume.
    fcst_pltr_df = timegpt.forecast(
        df=pltr_df, h=14, freq='B',
        time_col='date', target_col='Volume',
    )

    # Plot the last 90 observations of Volume together with the forecast.
    timegpt.plot(
        pltr_df,
        fcst_pltr_df,
        time_col='date',
        max_insample_length=90,
        target_col='Volume',
    )

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    WARNING:nixtlats.timegpt:The specified horizon h exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...

But what if we want to forecast all the series at once? We can do that by reshaping the data frame. It is currently in wide format, with one column per series, and we need to convert it to long format, with the series stacked one after another:

    # Reshape pltr_df so that each row holds a single observation.
    # id_vars keeps the date column as an identifier.
    # var_name names the new column that holds the original column names.
    pltr_long_df = pd.melt(
        pltr_df,
        id_vars=['date'],
        var_name='series_id'
    )

    # Show the first rows.
    pltr_long_df.head()

Output:

             date series_id  value
    0  2020-09-30      Open  10.00
    1  2020-10-01      Open   9.69
    2  2020-10-02      Open   9.06
    3  2020-10-05      Open   9.43
    4  2020-10-06      Open   9.04

Then we simply call the forecast method and specify the id_col argument.

    # Forecast the next 14 business days for every series in pltr_long_df.
    # id_col names the column that identifies each series,
    # time_col the timestamp column, target_col the value column.
    fcst_pltr_long_df = timegpt.forecast(
        df=pltr_long_df, h=14, freq='B',
        id_col='series_id', time_col='date', target_col='value',
    )

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    WARNING:nixtlats.timegpt:The specified horizon h exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...

    # Show the first five rows of the forecast.
    fcst_pltr_long_df.head()

Output:

       series_id        date    TimeGPT
    0  Adj Close  2023-09-25  14.688427
    1  Adj Close  2023-09-26  14.742798
    2  Adj Close  2023-09-27  14.781240
    3  Adj Close  2023-09-28  14.824156
    4  Adj Close  2023-09-29  14.795214

We can then plot the forecast for the Open series:

    # Plot the history and the forecast for selected series.
    # unique_ids lists the series to draw.
    # max_insample_length=90 shows only the last 90 historical observations.
    timegpt.plot(
        pltr_long_df,
        fcst_pltr_long_df,
        id_col='series_id',
        time_col='date',
        target_col='value',
        unique_ids=['Open'],
        max_insample_length=90,
    )

Forecasting with exogenous variables and irregular timestamps

In time series forecasting, the variable we predict is often influenced not only by its own past values but also by other factors. These external variables are known as exogenous variables. They can provide important additional context and can noticeably improve forecast accuracy. One such factor, and the focus of this tutorial, is the company's revenue. Revenue figures are a key indicator of a company's financial health and growth potential, both of which can strongly influence its stock price. We can obtain this data from openbb.

    # Read the revenue data from the given URL.
    revenue_pltr = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/openbb/revenue-pltr.csv')

    # Take the first value of the totalRevenue column.
    value = revenue_pltr['totalRevenue'].iloc[0]

    # Convert only if the column holds strings with an 'M' suffix.
    if not isinstance(value, float) and 'M' in value:

        def convert_to_float(val):
            # ' M' means millions, ' K' means thousands.
            if 'M' in val:
                return float(val.replace(' M', '')) * 1e6
            elif 'K' in val:
                return float(val.replace(' K', '')) * 1e3
            else:
                return float(val)

        # Apply the conversion to every value in the column.
        revenue_pltr['totalRevenue'] = revenue_pltr['totalRevenue'].apply(convert_to_float)

    # Show the last rows.
    revenue_pltr.tail()

Output:

      fiscalDateEnding  totalRevenue
    5       2022-06-30   473010000.0
    6       2022-09-30   477880000.0
    7       2022-12-31   508624000.0
    8       2023-03-31   525186000.0
    9       2023-06-30   533317000.0
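The suffix handling in convert_to_float can also be written as a small standalone helper that is easy to test without downloading the dataset. The following is a minimal sketch, not the tutorial's own code: the name parse_abbreviated_number is hypothetical, and the 'B' (billions) suffix is an assumption added for illustration, since the source only handles 'M' and 'K'.

```python
def parse_abbreviated_number(val):
    """Convert values such as '473.01 M' or '12 K' to floats.

    Hypothetical helper: the 'B' suffix is an assumption that is
    not present in the original tutorial code.
    """
    # Numbers pass through unchanged (as floats).
    if isinstance(val, (int, float)):
        return float(val)
    text = str(val).strip()
    multipliers = {"K": 1e3, "M": 1e6, "B": 1e9}
    suffix = text[-1].upper() if text else ""
    if suffix in multipliers:
        # Drop the suffix, then scale the remaining number.
        return float(text[:-1].strip()) * multipliers[suffix]
    return float(text)


thousands = parse_abbreviated_number("12 K")
millions = parse_abbreviated_number("473.01 M")
plain = parse_abbreviated_number("1.5")
```

Because the helper accepts numbers as well as strings, it could be applied to the column with revenue_pltr['totalRevenue'].apply(parse_abbreviated_number) without the isinstance guard used above.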
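The first thing we notice in the dataset is that the information only runs through the quarter ending 2023-06-30. The data is quarterly, and our goal is to use it to forecast daily stock prices for the 14 business days after that date. To compute a forecast that includes revenue as an exogenous variable, however, we need to know the future revenue values, because those values can significantly affect the stock price. Since we only want to forecast the next 14 business days of prices, we only need to forecast revenue for the one upcoming quarter. This gives us a coherent forecasting pipeline in which the output of one forecast (revenue) is used as the input of another (stock price), so that all available information is used.

    # Forecast one quarter of revenue and store it in fcst_pltr_revenue.
    # The frequency is inferred from fiscalDateEnding.
    fcst_pltr_revenue = timegpt.forecast(revenue_pltr, h=1, time_col='fiscalDateEnding', target_col='totalRevenue')

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    INFO:nixtlats.timegpt:Inferred freq: Q-DEC
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...

    # Show the forecast.
    fcst_pltr_revenue.head()

Output:

      fiscalDateEnding    TimeGPT
    0       2023-09-30  540005888

Continuing from where we left off, the next key step in the pipeline is to adjust the frequency of the revenue data so that it matches the frequency of the stock prices, which is based on business days. To do this we resample the historical revenue data to business days and forward-fill it:

    # Parse fiscalDateEnding as datetimes, then resample to business days,
    # carrying each quarterly value forward.
    revenue_pltr['fiscalDateEnding'] = pd.to_datetime(revenue_pltr['fiscalDateEnding'])
    revenue_pltr = revenue_pltr.set_index('fiscalDateEnding').resample('B').ffill().reset_index()

Important note: in this process we assign the same revenue value to every day that follows a given quarter end, until the next one. This simplification is necessary because of the large difference in granularity between quarterly revenue data and daily stock price data. In real applications, however, this assumption must be treated with caution. The effect of quarterly revenue on daily stock prices can vary considerably within a quarter, depending on factors such as changes in market expectations and other financial news and events. In this tutorial we use the assumption to illustrate how to include exogenous variables in a forecasting model; in practice, a more refined approach may be needed depending on the available data and the specific use case.

We can then create the complete historical dataset.

    # Rename fiscalDateEnding to date and merge the revenue into pltr_df.
    pltr_revenue_df = pltr_df.merge(revenue_pltr.rename(columns={'fiscalDateEnding': 'date'}))

    # Show the first rows.
    pltr_revenue_df.head()

Output:

             date       Open       High        Low      Close  Adj Close    Volume  Dividends  Stock Splits  totalRevenue
    0  2021-03-31  22.500000  23.850000  22.379999  23.290001  23.290001  61458500        0.0           0.0   341234000.0
    1  2021-04-01  23.950001  23.950001  22.730000  23.070000  23.070000  51788800        0.0           0.0   341234000.0
    2  2021-04-05  23.780001  24.450001  23.340000  23.440001  23.440001  65374300        0.0           0.0   341234000.0
    3  2021-04-06  23.549999  23.610001  22.830000  23.270000  23.270000  41933500        0.0           0.0   341234000.0
    4  2021-04-07  23.000000  23.549999  22.809999  22.900000  22.900000  32766200        0.0           0.0   341234000.0

Next, build the data frame that holds the future revenue values: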
    # Forecast horizon: 14 business days.
    horizon = 14

    import numpy as np

    # Build future_df with two columns, date and totalRevenue.
    # date: generate horizon + 1 business days starting at the last date in
    # pltr_revenue_df, then keep the last horizon dates (this drops the
    # starting date, which is already in the history).
    # totalRevenue: repeat the forecast revenue value horizon times.
    future_df = pd.DataFrame({
        'date': pd.date_range(pltr_revenue_df['date'].iloc[-1], periods=horizon + 1, freq='B')[-horizon:],
        'totalRevenue': np.repeat(fcst_pltr_revenue.iloc[0]['TimeGPT'], horizon)
    })

    # Show the first rows.
    future_df.head()

Output:

             date  totalRevenue
    0  2023-07-03     540005888
    1  2023-07-04     540005888
    2  2023-07-05     540005888
    3  2023-07-06     540005888
    4  2023-07-07     540005888

We can then pass the future revenue to the forecast method through the X_df argument. Because revenue is also a column of the historical data frame, the model uses it as an exogenous variable.

    # Forecast Close for the next horizon business days.
    # X_df supplies the future values of the exogenous variable.
    fcst_pltr_df = timegpt.forecast(
        pltr_revenue_df, h=horizon, freq='B',
        time_col='date', target_col='Close',
        X_df=future_df,
    )

Output:

    INFO:nixtlats.timegpt:Validating inputs...
    INFO:nixtlats.timegpt:Preprocessing dataframes...
    WARNING:nixtlats.timegpt:The specified horizon h exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.
    INFO:nixtlats.timegpt:Calling Forecast Endpoint...

    # Plot the history and the forecast.
    # pltr_revenue_df: price history with the totalRevenue column.
    # fcst_pltr_df: forecast Close prices.
    # max_insample_length=90 shows only the last 90 historical observations.
    timegpt.plot(
        pltr_revenue_df,
        fcst_pltr_df,
        id_col='series_id',
        time_col='date',
        target_col='Close',
        max_insample_length=90,
    )

We can also inspect the importance of revenue.

    timegpt.weights_x.plot.barh(x='features', y='weights')

Output:

    <Axes: ylabel='features'>
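The resampling step and the construction of the future data frame can be reproduced offline with made-up numbers, which makes it easier to see what each call returns. This is a minimal sketch under stated assumptions: the revenue values and forecast_value are invented for illustration, and no TimeGPT call is involved.

```python
import numpy as np
import pandas as pd

# Synthetic quarterly revenue (illustrative values, not real data).
revenue = pd.DataFrame({
    "fiscalDateEnding": pd.to_datetime(["2023-03-31", "2023-06-30"]),
    "totalRevenue": [100.0, 110.0],
})

# Upsample to business days; each quarter-end value is carried forward
# until the next quarter-end value becomes available.
daily = (
    revenue.set_index("fiscalDateEnding")
    .resample("B")
    .ffill()
    .reset_index()
    .rename(columns={"fiscalDateEnding": "date"})
)

horizon = 14
forecast_value = 120.0  # hypothetical stand-in for the revenue forecast

# Generate horizon + 1 business days from the last known date and drop
# the first one, which is already part of the history.
future_dates = pd.date_range(
    daily["date"].iloc[-1], periods=horizon + 1, freq="B"
)[-horizon:]
future = pd.DataFrame({
    "date": future_dates,
    "totalRevenue": np.repeat(forecast_value, horizon),
})

# Count weekend dates in both frames; business-day frequency yields none.
weekend_rows = int((daily["date"].dt.dayofweek > 4).sum())
weekend_future = int((future["date"].dt.dayofweek > 4).sum())
```

The future frame has exactly horizon rows and the same column names as the exogenous column in the history, which is the shape the tutorial passes through X_df.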


http://www.pierceye.com/news/16326/