For 1, the LLMs I currently test on are: 1. Google Gemini Pro 2. OpenAI GPT-4o and GPT-4 Turbo 3. Claude API 4. Google Vertex Claude 5. Mistral Large via API. Right now I am monkey-patching in support for Google Vertex Claude (which people use because the Claude API itself is extremely unreliable).