AI startup Anthropic, backed by Google and Amazon, recently launched a new model called Claude 3 that it claims is better than OpenAI’s GPT-4. While the results from technical benchmarks are impressive, they don’t necessarily reflect the user experience.
Claude 3 is a family of multimodal models available on the web and through select customer subscriptions. The models have large context windows, which let them process more input data and generate richer responses.
I decided to spend some time with Claude 3. The engineers stated that Claude is a growing lad. Like real children, some of its responses are better than others. That’s a nice way of saying “Results may vary”.
What’s Happening & Why This Matters
Claude Opus is similar to Google’s Gemini 1.5 Pro model in price, seats, and processing-and-response abilities. Unlike rival Gen-AI models, Opus doesn’t have access to the Internet or wider sources. Its knowledge is limited to data up to August 2023.
We tested Opus with a set of questions that covered a broad range of subjects, including current events, history, trivia, medical advice, and therapeutic guidance. Our results were:
- Opus provided high-level background on historical events but was unable to answer questions about more recent events outside its training data.
- Opus was more helpful with historical topics, providing specific information and guidance.
- Opus excelled in providing detailed, accurate answers to trivia questions.
- Opus recommended medications and indicated when to seek medical care for specific symptoms.
- Opus offered general advice for dealing with feelings of sadness and depression.
t/f Summary: What’s Next
While Opus performed well in some areas, it unsurprisingly struggled with questions that required knowledge of events after Summer 2023. This highlights the limitations of AI models and the importance of understanding their capabilities when using them.
As AI technology continues to advance, we are keen to see how models like Claude 3 evolve to address these challenges.