Jokes aside, this is a very helpful way to give end users (who for some reason may still need to decide which models to use, a problem you guys solve with Logic) a better grasp of how models approach specific tasks (in this case, coding).
The takeaway is very clear: don't use Ravenclaw for coding, use it to read books (i.e., OCR; have you guys tested its latest OCR model?).
Thanks for the post!
How did you compute complexity?
Interesting, but you didn’t mention which version of each foundational model you tested.