Struggling with Pandas, again

I’ve got a sort of love-hate relationship with Pandas: on one hand, when it works, it is the best thing since sliced bread. But when it doesn’t work as expected…

One thing I’ve still yet to fully figure out is the damn chained assignments and when it’s OK to assign new columns with [], .loc, .iloc, .copy or some weird combo of them… However, today I ran into another issue when trying to merge time-series and demographics dataframes into one.

When doing the df.join(), Pandas gave me the helpful error message


ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

without any indication of which columns it means. The dataframes have no columns in common apart from the key, and that is not int64 in either table, so I’ve no idea what it is complaining about. Googling turned up the usual Stack Overflow threads full of all sorts of hacks and workarounds, but this time even the people answering did not always seem to know why their fixes worked the way they did.
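For what it’s worth, here is a minimal sketch of one way this message can show up even when no key column is int64. It is only a guess at the cause, not a confirmed diagnosis of my case, and the frames and the names subject, value and age are made up. The idea is that join(on=...) compares the caller's column with the index of the other dataframe, and a dataframe that never had its index set carries a default int64 index.

```python
import pandas as pd

# Made-up frames; "subject", "value" and "age" are hypothetical names.
ts = pd.DataFrame({"subject": ["a", "a", "b"], "value": [1.0, 2.0, 3.0]})
demo = pd.DataFrame({"subject": ["a", "b"], "age": [30, 41]})

# join(on=...) matches the caller's column against the *index* of the other
# frame, not against the other frame's column of the same name.
print(ts["subject"].dtype)  # string-like key column
print(demo.index.dtype)     # int64, from the default RangeIndex

caught = None
try:
    ts.join(demo, on="subject")
except ValueError as err:
    caught = err

print(caught)
```

Printing the dtype of both the key column and the index of each dataframe is a quick way to see which pair the message might be referring to.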

In the end I found my own trick: it works perfectly if you give one of the dataframes a multi-index and use the key as the row index:
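A minimal sketch of the idea, not my exact code: the frames and the names subject, time, value and age are assumptions for illustration. The time series gets a (subject, time) multi-index, the demographics table gets the key as its row index, and the join then lines up on the shared index name.

```python
import pandas as pd

# Hypothetical data standing in for the time-series and demographics tables.
ts = pd.DataFrame({
    "subject": ["a", "a", "b"],
    "time": [0, 1, 0],
    "value": [1.0, 2.0, 3.0],
})
demo = pd.DataFrame({"subject": ["a", "b"], "age": [30, 41]})

# The time series gets a multi-index; the key is one of its levels.
ts_multi = ts.set_index(["subject", "time"])
# The demographics table uses the key as its row index.
demo_idx = demo.set_index("subject")

# The index name "subject" matches a level name of the multi-index,
# so the join aligns on that level.
joined = demo_idx.join(ts_multi, how="inner")
print(joined)
```

Each time-series row ends up next to the demographics of its subject, with no key column left over to trip the dtype check.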

Of course it only works if you pass the multi-indexed dataframe to the join method, and not the other way around:
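To compare the two directions without assuming the outcome, a sketch like the following records whether each call succeeds. The helper attempt_join and all data are hypothetical. Depending on the pandas version, the reverse call may raise or may also succeed, so the code only reports what happened.

```python
import pandas as pd

def attempt_join(left, right):
    """Hypothetical helper: return (result, error) for left.join(right)."""
    try:
        return left.join(right, how="inner"), None
    except Exception as err:  # keep whatever pandas raises for inspection
        return None, err

# Made-up frames, indexed as described above.
ts_multi = pd.DataFrame({
    "subject": ["a", "a", "b"],
    "time": [0, 1, 0],
    "value": [1.0, 2.0, 3.0],
}).set_index(["subject", "time"])
demo_idx = pd.DataFrame(
    {"subject": ["a", "b"], "age": [30, 41]}
).set_index("subject")

# Multi-indexed frame passed as the argument to join.
forward, forward_err = attempt_join(demo_idx, ts_multi)
# The other way around: multi-indexed frame as the caller.
reverse, reverse_err = attempt_join(ts_multi, demo_idx)

print("forward ok:", forward_err is None)
print("reverse ok:", reverse_err is None)
```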

I should take time to figure out Pandas fully one day, but I feel it will be a very deep rabbit hole…