I don't think some LMs are more biased towards their teams than others. I think that some LMs are simply more generous when evaluating translations. Currently, I'm not in a position to name any LMs here, mostly because I don't speak most of the languages in the translation category. But what I can say for sure is that I strongly doubt that the very high scores often seen in the translation category are due to the outstanding quality of the translations.
A system which could work would be: every LM has the possibility to make their own translations, which are then checked by another LM, so there are at least two LMs in each language category. I don't think an extensive questionnaire will lead to better results. Rather, the questionnaires should be tangible, meaning that you have to declare how many "sentences" you corrected. This should be divided into three classes of mistakes (heavy, medium, easy). Because the counts can be set against the number of words translated, this system is flexible. I think the terms that are currently in use are not objective at all.
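To make the idea more concrete, here is a rough sketch of how such a tally could be turned into a single number. Everything in it is an assumption for illustration: the weights (3, 2, 1), the `Review` class, and the "per 1000 words" normalization are made up by me and are not something Utopian currently uses.

```python
from dataclasses import dataclass

# Hypothetical weights per class of mistake; the actual values would
# have to be agreed on by the LMs.
WEIGHTS = {"heavy": 3.0, "medium": 2.0, "easy": 1.0}


@dataclass
class Review:
    """What the reviewing LM declares for one contribution."""
    words_translated: int
    heavy: int = 0   # corrected sentences with heavy mistakes
    medium: int = 0  # corrected sentences with medium mistakes
    easy: int = 0    # corrected sentences with easy mistakes


def weighted_errors_per_1000_words(review: Review) -> float:
    """Weighted number of corrections, normalized by contribution size."""
    if review.words_translated <= 0:
        raise ValueError("words_translated must be positive")
    weighted = (
        WEIGHTS["heavy"] * review.heavy
        + WEIGHTS["medium"] * review.medium
        + WEIGHTS["easy"] * review.easy
    )
    return 1000.0 * weighted / review.words_translated


if __name__ == "__main__":
    review = Review(words_translated=1000, heavy=1, medium=2, easy=3)
    print(weighted_errors_per_1000_words(review))
```

A lower value would mean a cleaner translation. How this value maps to the final score is a separate decision that I'm leaving open here.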
I know that Utopian tries to bring value to the Steem blockchain, but I don't see the point of evaluating the style of the post; the score should be based solely on the translation in Crowdin. Let me know what you think @elear.