Genspark has transformed its Deep Research Agent by leveraging Fireworks’ Reinforcement Fine-Tuning (RFT) solution to train an open-source model in just four weeks. The result: a 12% quality improvement and 33% more tool calls compared to a frontier closed model, at 50% lower cost. This collaboration highlights the power of open models and reinforcement learning in delivering better AI solutions. Check out our case study to learn more: https://lnkd.in/gtuwKAFP
Fireworks AI
Software Development
Redwood City, CA 28,445 followers
Run AI faster, more efficiently, and on your own terms
About us
Fireworks is the fastest way to build, tune, and scale AI on open models. Ship production-ready AI in seconds on our globally distributed cloud infrastructure, optimized for your use case. Fireworks powers production workloads at companies like Uber, Doordash, Notion, and Cursor—delivering 15× faster speed, 4× lower latency, and 4× more concurrency than closed models.
- Website
- http://fireworks.ai
- Industry
- Software Development
- Company size
- 51-200 employees
- Headquarters
- Redwood City, CA
- Type
- Privately Held
- Founded
- 2022
- Specialties
- LLMs, Generative AI, artificial intelligence, developer tools, software engineering, and inference
Locations
- Primary: Redwood City, CA 94063, US
Employees at Fireworks AI
Updates
- Interested in hearing directly from the leaders shaping the next wave of AI? We’re bringing together executives from Datadog, Clay, Index Ventures, and MongoDB next Wednesday at Nasdaq to explore what it truly means to Own Your AI. Want to join the conversation? https://lnkd.in/gQH3Uua6
- Big week for Fireworks! Yesterday, we announced our $250M Series C, and today, we’re proud to be named one of LinkedIn’s Top Startups in San Francisco. Huge thanks to our customers, partners, and teammates for making Fireworks what it is today. https://lnkd.in/gnpD3t8W
- Unleash the power of multimodal AI with NVIDIA Nemotron Nano 2 VL on Fireworks AI! We’re excited to offer Day-0 support for this 12B vision-language model, delivering leading accuracy for document intelligence and video understanding. From automating invoice processing to enhancing image search, discover how NVIDIA Nemotron Nano 2 VL can transform your workflows. To help you evaluate its performance, we've developed a custom cookbook for a document-processing use case. Dive into our comprehensive blog for more details on getting started: https://lnkd.in/gChsJuWx
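For document-processing requests like the invoice use case above, vision models behind an OpenAI-compatible chat API accept images as base64 data URLs inside the message content. The sketch below shows one way such a request might be packaged; the message shape follows the common OpenAI-compatible convention, and the exact Nemotron Nano 2 VL model identifier should be taken from the Fireworks model library rather than assumed from this example.

```python
import base64

def image_message(image_bytes: bytes, question: str) -> dict:
    """Package an image plus a question as one OpenAI-compatible user message."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            # Image is inlined as a data URL; PNG is assumed here.
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }

# Placeholder bytes stand in for a real invoice scan.
msg = image_message(b"\x89PNG...", "Extract the invoice total as JSON.")
print(msg["content"][0]["text"])  # → Extract the invoice total as JSON.
```

The resulting message would be sent as one element of the `messages` array in a chat completions request.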
- AI is no longer an experiment. It’s the battlefield for competitiveness. Fireworks AI has raised $250 million in Series C funding, co-led by Lightspeed Venture Partners and Index Ventures, with participation from Sequoia Capital and Evantic, bringing our valuation to $4 billion. Fireworks is the AI inference cloud powering production workloads for companies like Uber, Genspark, Retell AI, Shopify, and GitLab. We deliver up to 40x faster inference and 8x lower cost while giving customers full control over their data and IP. This funding will accelerate our mission to build artificial autonomous intelligence: automated model and product co-design for maximum speed, quality, and efficiency. Check it out here: https://lnkd.in/gzdxkADz
- Fireworks is first to deliver Day-0 support for MiniMax M2, an efficient MoE with 230B total parameters and 10B active parameters for optimal intelligence, speed, and cost. Key advantages for your stack:
  - Coding excellence: top-tier coding capabilities at 8% the cost of Claude Sonnet, with 2x the speed.
  - Powerful agentic performance: stable execution and multi-tool coordination for complex agentic workflows.
  - Scalable efficiency: an optimal balance of performance and cost for large-scale deployments.
  Start developing on Serverless or On-Demand with the MiniMax M2 preview on Fireworks and elevate your agentic applications today! https://lnkd.in/gdbEne5h #MinimaxM2 #AgenticAI #FireworksAI
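Serverless models on Fireworks are reachable through an OpenAI-compatible chat completions endpoint. A minimal sketch of assembling such a request is below; the MiniMax M2 model slug shown is an assumption for illustration — check the Fireworks model library for the exact identifier before using it.

```python
import json

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_chat_request(api_key: str, prompt: str,
                       model: str = "accounts/fireworks/models/minimax-m2"):
    """Return (url, headers, payload) for an OpenAI-compatible chat call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }
    return API_URL, headers, payload

url, headers, payload = build_chat_request("<FIREWORKS_API_KEY>",
                                           "Write a binary search in Python.")
# To actually send it (requires the `requests` package and a real key):
#   resp = requests.post(url, headers=headers, json=payload, timeout=60)
#   print(resp.json()["choices"][0]["message"]["content"])
print(json.dumps(payload, indent=2))
```

Because the endpoint is OpenAI-compatible, existing OpenAI client code can typically be pointed at it by swapping the base URL and model name.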
- Just two days after topping the GLM 4.6 Artificial Analysis benchmarks, Fireworks has beaten its own numbers, boosting speed by up to 36%: 1000-input-token throughput is now 193 TPS, and 100-input-token throughput is 245 TPS. https://lnkd.in/g55NFvkX
- Reinforcement Learning is becoming the foundation for how modern AI systems learn, adapt, and optimize in production. Join us on November 20th for a hands-on developer meetup where we’ll dive into how Reinforcement Learning (RL) can help you build smarter, more adaptive systems. We’ll walk through how to apply RL in practice using Eval Protocol + Fireworks AI, including a live demo on improving a Text-to-SQL model with real feedback loops. You’ll leave with a clear understanding of:
  → What RL really means in a product context
  → How to measure its ROI and performance impact
  → Practical steps to start experimenting with it
  Speakers: Aishwarya Srinivasan · Dylan Huang · Derek Xu
  Where: Snowflake Silicon Valley AI Hub, Menlo Park
  When: November 20
  Register here: https://lnkd.in/g242xpQS
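A common feedback signal in a Text-to-SQL loop like the one demoed above is an execution-based reward: run the model's query and a reference query against the same database and reward matching result sets. The sketch below is a generic illustration of that idea, not the Eval Protocol API; the table, queries, and reward scale are invented for demonstration.

```python
import sqlite3

def execution_reward(db: sqlite3.Connection,
                     predicted_sql: str, gold_sql: str) -> float:
    """Reward 1.0 if the predicted query returns the same rows as the gold query."""
    try:
        pred = db.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return 0.0  # invalid SQL earns no reward
    gold = db.execute(gold_sql).fetchall()
    # Order-insensitive comparison of result rows.
    return 1.0 if sorted(map(repr, pred)) == sorted(map(repr, gold)) else 0.0

# Toy database standing in for the real evaluation environment.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, total REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

r = execution_reward(db,
                     "SELECT id FROM orders WHERE total > 10",
                     "SELECT id FROM orders WHERE total > 10.0")
print(r)  # → 1.0
```

In an RL setup this scalar would be fed back to the fine-tuning loop so the model is optimized toward queries that execute correctly, not just ones that look plausible.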
- Tired of complex LLM inference configurations? Fireworks launched 1-click deployment shapes! Choose from "Fast," "Throughput," or "Minimal" presets for dedicated on-demand GPUs, tailored to your needs. Simplify your deployments now! Dive deeper into the details in our latest blog: https://lnkd.in/gWEhWxMC
- Fireworks just topped the Artificial Analysis benchmark with the fastest GLM 4.6 reasoning score at 153 TPS! For teams deploying GLM 4.6 at scale, this means more consistent throughput, lower latency, and reliable uptime, all critical factors for running production inference workloads. Dive into the details and see how we can enhance your applications in long-context reasoning, agentic behavior, code generation, and search capabilities. Ready to elevate your applications? Get started today: https://lnkd.in/gzTdvqfY