I Field Tested Gemini 3.5 Flash: Fast Boi, Smol Brain.
The Signal
Google’s recent release of Gemini 3.5 Flash and their agentic 'Spark' mode has sparked a debate over whether the models offer superior value or merely aggressive marketing. Tech reviewer Matt Vid Pro evaluates these tools against OpenAI’s GPT-5.5, identifying a central tension between Google’s speed-optimized, mid-priced infrastructure and the more dependable, nuanced output of premium competitors.
The Case
- Gemini 3.5 Flash is notably fast and cost-effective, but it frequently failed during complex creative and coding tasks in the speaker's tests, often producing generic results or crashing.
- GPT-5.5 consistently outperformed Gemini 3.5 Flash by generating more reliable, functional, and detailed code in first-pass attempts, including complex simulations involving water physics and interactive NPCs.
- Google’s new Spark and Antigravity agentic workflows show promise by integrating directly with Google Drive, yet they suffer from inconsistency, lag, and unfinished tasks, which makes them less reliable than the current market favorite, Codex.
- The speaker argues that benchmarks, which show mixed results between the models, are not enough on their own; he asserts that 'oddball' consumer-facing tasks reveal significant limitations in Google’s models that static scores fail to capture.
- Pricing for Gemini 3.5 Flash has climbed to $1.50 per million input tokens and $9 per million output tokens, a three-fold increase that pushes it into a more expensive 'mid-range' bracket rather than the 'dirt cheap' tier of previous versions.
- Certain failures, such as models defaulting to image generation or producing JavaScript errors, appear to be tied as much to buggy UI and tool routing as to the capability of the underlying model.
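To put the pricing point in concrete terms, here is a minimal sketch of the per-request arithmetic. It uses the two prices quoted above; the earlier prices are only inferred by dividing by the stated three-fold factor, and the token counts are a hypothetical workload, not figures from the video.

```python
# Prices quoted in the summary, in USD per million tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00

# Assumption: the "three-fold increase" applies equally to input and output,
# so the earlier prices are inferred by dividing the quoted ones by 3.
PRICE_MULTIPLIER = 3


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = INPUT_PRICE_PER_M,
                 output_price: float = OUTPUT_PRICE_PER_M) -> float:
    """Return the USD cost of one request at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 20,000 input tokens and 4,000 output tokens.
current = request_cost(20_000, 4_000)
previous = request_cost(
    20_000, 4_000,
    INPUT_PRICE_PER_M / PRICE_MULTIPLIER,
    OUTPUT_PRICE_PER_M / PRICE_MULTIPLIER,
)

print(f"current:  ${current:.3f} per request")   # current:  $0.066 per request
print(f"previous: ${previous:.3f} per request")  # previous: $0.022 per request
```

At these rates output tokens cost six times as much as input tokens, so output-heavy tasks such as code generation feel the increase most.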
The 1 Minute Signal Take
The evidence suggests that while Google's new model is a capable utility for light, high-speed tasks, it is currently outclassed by GPT-5.5 for high-stakes, iterative, or complex agentic work. Skip this video if you only need a high-level briefing, but watch it if you want to see the specific, failed-launch visuals and the 'globe simulation' coding breakdown that underscore the gap between marketing claims and functional reality.
