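I can write a function that merges several columns into a new column, but I cannot convert the int columns to float before casting them to string for the merge.
I want the integers in the merged column to carry a trailing ".000000".
Ultimately, I want to use the merged column as the key for joining two vaex DataFrames on multiple keys/columns. Since vaex seems to accept only a single column/key when joining two DataFrames, I need to use the combined column as the key.
The int-to-float conversion is needed because a column is int in one vaex DataFrame while the corresponding column in the other is float.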
The code is below.
The function new_column_by_column_merging works, but new_column_by_column_merging2 does not. Is there a way to make it work?
import vaex
import pandas as pd
import numpy as np
def new_column_by_column_merging(df, columns=None):
    if columns is None:
        columns = df.get_column_names()
    if type(columns) is str:
        df['merged_column_key'] = df[columns]
        return df
    df['merged_column_key'] = np.array(['']*len(df))
    for col in columns:
        df['merged_column_key'] = df['merged_column_key'] + '_' + df[col].astype('string')
    return df
def new_column_by_column_merging2(df, columns=None):
    if columns is None:
        columns = df.get_column_names()
    if type(columns) is str:
        df['merged_column_key'] = df[columns]
        return df
    df['merged_column_key'] = np.array(['']*len(df))
    for col in columns:
        try:
            df[col] = df[col].astype('float')
        except:
            print('fail to convert to float')
        df['merged_column_key'] = df['merged_column_key'] + '_' + df[col].astype('string')
    return df
pandas_df = pd.DataFrame({
    'Name': ['Tom', 'Joseph', 'Krish', 'John'],
    'Last Name': ['Johnson', 'Cameron', 'Biden', 'Washington'],
    'Age': [20, 21, 19, 18],
    'Weight': [60.0, 61.0, 62.0, 63.0],
})
print('pandas_df is')
print(pandas_df)
df = vaex.from_pandas(df=pandas_df, copy_index=False)
df1 = new_column_by_column_merging(df, ['Name', 'Age', 'Weight'])
print('new_column_by_column_merging returns')
print(df1)
df2 = new_column_by_column_merging2(df, ['Name', 'Age', 'Weight'])
print('new_column_by_column_merging2 returns')
print(df2)
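
To make the intended key format concrete, here is a minimal sketch in plain Python. It does not use the vaex API, and vaex's own string conversion may format floats differently. The helper names `key_part` and `merged_key` are hypothetical and exist only for this illustration. It shows why int and float values need a common representation: `str(20)` gives '20' while `str(20.0)` gives '20.0', so keys built from an int column and a float column would not match unless both are rendered the same way.

```python
def key_part(value):
    # Hypothetical helper: render plain Python ints and floats with six
    # decimals so that 20 and 20.0 yield the same text; use str() otherwise.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return format(float(value), '.6f')
    return str(value)


def merged_key(values):
    # Mirrors the "'_' + value" concatenation from the functions above,
    # so the resulting key starts with an underscore.
    return ''.join('_' + key_part(v) for v in values)


# The same row, once with an int age and once with a float age.
print(merged_key(['Tom', 20, 60.0]))    # _Tom_20.000000_60.000000
print(merged_key(['Tom', 20.0, 60.0]))  # _Tom_20.000000_60.000000
```

With a shared fixed-decimal format, the two keys compare equal regardless of whether the source value was an int or a float, which is the property the merged column needs in order to serve as a join key.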