What's New in ChatGPT & GPT-5
Real breakthroughs that change how teams work
What Actually Changed
Between August and September 2025, OpenAI released a cascade of updates that fundamentally shift what's possible with AI in the workplace. Here's what matters for enterprise teams.
GPT-5 MAJOR
OpenAI's most advanced model to date, combining reasoning and speed in one unified system with automatic routing.
- State-of-the-art on coding (74.9% SWE-bench Verified)
- 94.6% on AIME 2025 math competition
- 45% fewer hallucinations than GPT-4o
- Auto-switches between fast and thinking modes
One model handles both quick queries and complex analysis without manual switching.
Instant Checkout NEW
Buy products directly in ChatGPT conversations powered by the Agentic Commerce Protocol with Stripe.
- Etsy purchases available now
- 1M+ Shopify merchants coming soon
- No ads - organic, relevance-based results
- Apple Pay, Google Pay integration
Research → decision → purchase in one conversation flow.
ChatGPT Pulse NEW
Proactive research updates delivered as visual cards based on your chats, calendar, and preferences.
- Overnight research synthesis
- Gmail and Google Calendar integration
- Curated daily briefings (5-10 cards)
- Feedback-driven personalization
ChatGPT starts the conversation - you wake up informed.
Sora 2 NEW
Advanced video and audio generation with realistic physics, synchronized dialogue, and a social app.
- Accurate physics simulation
- Synchronized audio and dialogue
- "Cameos" feature for personal videos
- Sora iOS app with social feed
Training content, concept videos, and storyboarding at speed.
GDPval Benchmark NEW
Evaluation framework measuring AI on 1,320 real-world tasks across 44 occupations.
- Tasks from 9 GDP-contributing sectors
- Created by experts with 14+ years of experience
- Claude Opus 4.1 leads at a 47.6% win-or-tie rate
- GPT-5 at 38.8%, roughly 3x GPT-4o's score
Framework for measuring task-level ROI and adoption.
Enhanced Infrastructure
Core improvements making everything more reliable and capable.
- 400K token context window
- Projects for organized workflows
- Memory that learns preferences
- More connectors (Drive, Gmail, Calendar)
Less context-switching, more continuity across sessions.
GPT-5: What Makes It Different
Released August 7, 2025, GPT-5 unifies OpenAI's advances in reasoning, coding, and multimodal understanding into one intelligent system.
| Capability | GPT-4o / o3 | GPT-5 |
|---|---|---|
| Model Selection | Manual switching required | Auto-router chooses optimal model per query |
| Reasoning | Separate o-series models | Integrated thinking mode with status updates |
| Coding | 69.1% SWE-bench Verified (o3) | 74.9% SWE-bench Verified, 88% Aider Polyglot |
| Math | Various specialized models | 94.6% AIME 2025, 88.4% GPQA with extended reasoning |
| Hallucinations | Baseline | ~45% fewer factual errors than GPT-4o (with search); ~80% fewer than o3 (with thinking) |
| Context Window | 128K-200K tokens | Up to 400K tokens |
| Personality | Sometimes overly agreeable | Less sycophantic (14.5% → 6%), more natural |
New Features in Detail
Instant Checkout: How It Works
- Ask shopping questions in natural language
- Get organic, unsponsored product recommendations
- Click "Buy" to complete purchase without leaving chat
- Confirm with Apple Pay, Google Pay, Stripe, or card
- Merchant handles fulfillment through existing systems
Agentic Commerce Protocol
Open-source protocol co-developed with Stripe that enables:
- AI agents, people, and businesses to transact together
- Merchants to integrate checkout into conversational AI
- Secure payment handling with Shared Payment Tokens
- Privacy-preserving transaction flows
What Pulse Does
- Runs overnight research based on your conversations and calendar
- Delivers 5-10 personalized visual cards each morning
- Connects to Gmail for important messages and Google Calendar for upcoming events
- Learns from your feedback (thumbs up/down) to improve relevance
- Cards expire daily unless saved or discussed
Enterprise Applications
Teams are using Pulse for:
- Executive Briefings: Curated daily summaries for leadership
- Project Tracking: Overnight updates on active initiatives
- Competitive Intelligence: Monitor specific companies or markets
- Meeting Prep: Context and agendas for upcoming meetings
Sora 2: Key Improvements
- Physics Accuracy: Realistic object interactions (e.g., basketballs actually bounce)
- Synchronized Audio: Dialogue and sound effects match video timing
- Controllability: Better instruction following and style consistency
- Realism: Higher quality, more photorealistic outputs
Sora App Features
- Cameos: Insert verified users into AI-generated videos
- Remix: Modify and build on existing videos
- Feed: TikTok-style discovery optimized for creation, not consumption
- Safety: Watermarking, C2PA provenance, content moderation
Enterprise Use Cases
- Training Content: Quick explainer videos and product demos
- Concept Testing: Rapid storyboarding for ads and campaigns
- Internal Comms: Engaging announcements and updates
- Documentation: Visual tutorials and process guides
Important: Currently invite-only on iOS (U.S. and Canada). Android and broader rollout coming soon. All generated videos include watermarks and provenance data.
Business Impact: The GDPval Story
GDPval, named after gross domestic product (GDP), is OpenAI's new benchmark that measures AI performance on economically valuable tasks across 44 occupations. Here's why it matters.
The Methodology
- 1,320 tasks from 9 major GDP-contributing sectors
- Tasks created by professionals with 14+ years of experience
- Real deliverables: briefs, blueprints, care plans, code
- Blind expert evaluation comparing AI vs. human outputs
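As a rough illustration of how a win-or-tie rate could be tallied from blind pairwise grades, here is a minimal Python sketch. The verdict labels and the sample data are hypothetical and are not taken from GDPval itself; the actual grading pipeline may differ.

```python
from collections import Counter

# Hypothetical grader verdicts for one model: each entry records whether the
# blind grader preferred the AI deliverable, the human one, or rated them equal.
verdicts = ["ai", "human", "tie", "human", "ai", "human", "tie", "human"]


def win_or_tie_rate(verdicts: list[str]) -> float:
    """Share of tasks where the AI output was rated as good as or better than the expert's."""
    if not verdicts:
        raise ValueError("no verdicts to score")
    counts = Counter(verdicts)
    return (counts["ai"] + counts["tie"]) / len(verdicts)


rate = win_or_tie_rate(verdicts)
print(f"win-or-tie rate: {rate:.1%}")
```

Counting ties together with wins matches the "win/tie rate" wording used for the results; a stricter metric would count wins only.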
The Results
- Claude Opus 4.1: 47.6% win/tie rate (best overall)
- GPT-5: 38.8% win/tie rate (strong on accuracy)
- GPT-4o: 13.7%, so GPT-5's score is roughly a 3x improvement in 15 months
- Models complete tasks ~100x faster and cheaper than human experts
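The "3x" figure can be checked with simple arithmetic on the rates quoted above. This is a small Python sketch; the variable and function names are illustrative.

```python
# Win-or-tie rates quoted in the results above, in percent.
rates = {"Claude Opus 4.1": 47.6, "GPT-5": 38.8, "GPT-4o": 13.7}


def improvement_factor(new: float, old: float) -> float:
    """How many times higher the new rate is than the old one."""
    return new / old


gpt5_vs_gpt4o = improvement_factor(rates["GPT-5"], rates["GPT-4o"])
gap_to_leader = rates["Claude Opus 4.1"] - rates["GPT-5"]

print(f"GPT-5 vs GPT-4o: {gpt5_vs_gpt4o:.2f}x")
print(f"Gap to Claude Opus 4.1: {gap_to_leader:.1f} percentage points")
```

The ratio comes out slightly under 3, so "roughly 3x" is a fair rounding rather than an exact figure.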
What This Means
- AI approaching expert quality on many knowledge work tasks
- Even the leading model matches or beats the expert on fewer than half of tasks, so human oversight is still needed
- Best for routine, well-defined deliverables
- Human expertise still critical for judgment and nuance
Team Implementation Playbook
Week 1: Quick Wins
- Set up ChatGPT Plus or Pro accounts for the pilot team
- Enable Pulse (Pro accounts only during the preview) for 2-3 key executives, with 3 topics each
- Connect Google Calendar and Gmail with least-privilege access
- Document initial use cases and gather feedback
Weeks 2-4: Measure Impact
- Track: Time to first usable draft, re-prompts per task
- Monitor: Hallucination rate, citation accuracy
- Log: Copy-paste operations avoided, manual lookups saved
- Survey: Team satisfaction and productivity perception
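One way to make these measurements concrete is a small per-task log that is aggregated each week. The Python sketch below is only an assumption about how a team might record the data; `TaskLog`, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskLog:
    """One pilot task; all field names are illustrative."""
    minutes_to_first_usable_draft: float
    reprompts: int
    claims_checked: int
    claims_wrong: int  # claims a reviewer flagged as hallucinated


def summarize(logs: list[TaskLog]) -> dict[str, float]:
    """Aggregate the tracked metrics across a set of logged tasks."""
    total_checked = sum(t.claims_checked for t in logs)
    total_wrong = sum(t.claims_wrong for t in logs)
    return {
        "avg_minutes_to_first_draft": mean(t.minutes_to_first_usable_draft for t in logs),
        "avg_reprompts_per_task": mean(t.reprompts for t in logs),
        # Guard against division by zero when nothing was fact-checked.
        "hallucination_rate": total_wrong / total_checked if total_checked else 0.0,
    }


logs = [
    TaskLog(12.0, 2, 10, 1),
    TaskLog(8.0, 1, 5, 0),
    TaskLog(10.0, 3, 5, 1),
]
summary = summarize(logs)
print(summary)
```

The hallucination rate is pooled over all checked claims rather than averaged per task, so tasks with more checked claims carry more weight.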
Governance Essentials
- Memory: Only stable, non-sensitive facts; monthly reviews
- Connectors: Least-privilege scopes; approval for outbound actions
- Verification: Mandatory browsing for time-sensitive claims
- Incidents: Abstain → verify → escalate workflow with logging
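The abstain, verify, escalate sequence with logging could look something like the following Python sketch. The `handle_claim` function, the `confident` flag, and the trusted-set checker are all assumptions made for illustration; a real deployment would plug in its own confidence signal and verification step.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai_incidents")


def handle_claim(claim: str, confident: bool, verify: Callable[[str], bool]) -> str:
    """Return 'accepted', 'verified', or 'escalated' for one AI-generated claim."""
    if confident:
        log.info("accepted: %s", claim)
        return "accepted"
    # Abstain: do not act on the claim until it has been checked.
    log.info("abstained: %s", claim)
    # Verify: run an independent check (for example, a source lookup).
    if verify(claim):
        log.info("verified: %s", claim)
        return "verified"
    # Escalate: hand the claim off to a human reviewer.
    log.warning("escalated: %s", claim)
    return "escalated"


# Hypothetical checker: treats claims found in a trusted set as verified.
trusted = {"GPT-5 was released on August 7, 2025"}
outcome = handle_claim(
    "Pulse is available to all users",
    confident=False,
    verify=lambda c: c in trusted,
)
print(outcome)  # escalated
```

Every branch writes a log line, so the incident trail records which claims were held back, which passed verification, and which went to a person.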
Scale Strategically
- Start with high-impact, low-risk use cases
- Build internal champions who can train others
- Create prompt libraries for common tasks
- Establish feedback loops for continuous improvement
Frequently Asked Questions
You no longer need to switch models manually. GPT-5 includes an auto-router that chooses between fast and thinking modes based on query complexity. You can override it for specific use cases (audits, complex analysis), but most users never need to.
GPT-5 still hallucinates, but significantly less often. With web search enabled, it makes about 45% fewer factual errors than GPT-4o; with extended reasoning enabled, about 80% fewer than o3. For anything consequential, always request sources and verify time-sensitive information.
Pulse is currently in preview for ChatGPT Pro users on mobile (iOS). OpenAI plans to expand it to Plus users and eventually to all users. The feature works best when you connect Gmail and Google Calendar for context.
Instant Checkout is available now to U.S. ChatGPT users (Free, Plus, and Pro) for Etsy purchases. Over 1 million Shopify merchants (including brands like Glossier, Skims, Spanx) are coming soon. Simply ask shopping-related questions and look for the "Buy" button on compatible products.
According to OpenAI's own GDPval benchmark, Claude Opus 4.1 currently performs better overall than GPT-5 (47.6% vs. 38.8% win-or-tie rate), particularly on aesthetics and formatting. GPT-5 is stronger at instruction-following and integrates more tightly with OpenAI's ecosystem (Pulse, Instant Checkout, Sora). Both are approaching expert-level quality on many tasks.