BERT for Regression and Mixing BERTs

  1. If the target variable is the price of a house and there is a single explanatory variable given as text, is this a multiclass classification problem for BERT, and if so, how many classes should we allocate? Or, if it is a regression problem, can BERT handle it?

  2. If we use two different BERT variants, e.g. one BERT for English and another for French, can we concatenate their embeddings and feed them to an MLP? Can we run each BERT separately, save the [CLS] embeddings in separate files, and then train the MLP in another file?
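
To make the workflow in question 2 concrete, here is a rough sketch of what I have in mind, combined with the regression setup from question 1. Everything in it is an assumption for illustration: the random arrays stand in for [CLS] embeddings that two separately run encoders would produce, the file names `cls_en.npy` and `cls_fr.npy` are hypothetical, the prices are synthetic, and a closed-form ridge regression is swapped in for the MLP to keep the example short. It only shows the save, load, concatenate, and regress steps, not an actual BERT run.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumption); real BERT-base [CLS] vectors have 768 dimensions.
n_samples, dim_en, dim_fr = 200, 8, 6

# Stand-ins for [CLS] embeddings from an English and a French encoder.
# In practice each array would come from running that encoder on its texts.
cls_en = rng.normal(size=(n_samples, dim_en))
cls_fr = rng.normal(size=(n_samples, dim_fr))

# Synthetic continuous target (e.g. house price), invented for this sketch.
true_w = rng.normal(size=dim_en + dim_fr)
prices = np.hstack([cls_en, cls_fr]) @ true_w + 0.01 * rng.normal(size=n_samples)

# Step 1: each encoder run writes its embeddings to its own file.
workdir = tempfile.mkdtemp()
path_en = os.path.join(workdir, "cls_en.npy")  # hypothetical file name
path_fr = os.path.join(workdir, "cls_fr.npy")  # hypothetical file name
np.save(path_en, cls_en)
np.save(path_fr, cls_fr)

# Step 2: a separate script loads both files and concatenates along the
# feature axis. Row i in both files must refer to the same example.
features = np.concatenate([np.load(path_en), np.load(path_fr)], axis=1)

# Step 3: regression head with one continuous output and squared-error loss.
# Ridge regression is used here in place of an MLP (simplification).
X = np.hstack([features, np.ones((n_samples, 1))])  # append a bias column
ridge = 1e-3
weights = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ prices)

predictions = X @ weights
mse = float(np.mean((predictions - prices) ** 2))
print(features.shape, mse)
```

In this sketch the head predicts a single number per example, so no classes are allocated at all. The one thing the separate-files approach seems to depend on is that both files keep the examples in the same row order.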