Magic, an AI startup creating models to generate code and automate a range of software development tasks, has raised a large tranche of cash from investors including ex-Google CEO Eric Schmidt.
In a blog post on Thursday, Magic said that it closed a $320 million fundraising round with contributions from Schmidt, as well as Alphabet’s CapitalG, Atlassian, Elad Gil, Jane Street, Nat Friedman & Daniel Gross, Sequoia, and others. The funding brings the company’s total raised to nearly half a billion dollars ($465 million), catapulting it into a cohort of better-funded AI coding startups whose members include Codeium, Cognition, Poolside, Anysphere and Augment. (Interestingly, Schmidt is backing Augment, too.)
In July, Reuters reported that Magic was seeking to raise over $200 million at a $1.5 billion valuation. Evidently, the round came in above expectations, although the startup’s current valuation couldn’t be ascertained; Magic was valued at $500 million in February.
Magic also announced today a partnership with Google Cloud to build two “supercomputers” on Google Cloud Platform. One, Magic-G4, will be made up of Nvidia H100 GPUs, while the other, Magic-G5, will comprise Nvidia’s next-gen Blackwell chips. (GPUs, thanks to their ability to run many computations in parallel, are commonly used to train and run generative AI models.)
Magic says it aims to scale the latter cluster to “tens of thousands” of GPUs over time.
“We are excited to partner with Google and Nvidia to build our next-gen AI supercomputer on Google Cloud,” Magic co-founder and CEO Eric Steinberger said in a statement. “Nvidia’s [Blackwell] system will greatly improve inference and training efficiency for our models, and Google Cloud offers us the fastest timeline to scale, and a rich ecosystem of cloud services.”
Steinberger and Sebastian De Ro co-founded Magic in 2022. In a previous interview, Steinberger told TechCrunch that he was inspired by the potential of AI at a young age; in high school, he and his friends wired up the school’s computers for machine-learning algorithm training.
That experience planted the seeds for Steinberger’s computer science Bachelor’s program at Cambridge (he dropped out after a year) and, later, a job at Meta as an AI researcher. De Ro hails from German business process management firm FireStart, where he worked his way up to the role of CTO.
Magic develops AI-driven tools (not yet for sale) designed to help software engineers write, review, debug and plan code changes. The tools operate like an automated pair programmer, attempting to understand and continuously learn more about the context of various coding projects.
Lots of platforms do the same, including the elephant in the room, GitHub Copilot. But one of Magic’s innovations lies in its models’ ultra-long context windows. It calls the models’ architecture “Long-term Memory Network,” or “LTM” for short.
A model’s context, or context window, refers to input data (e.g. code) that the model considers before generating output (e.g. additional code). A simple question — “Who won the 2020 U.S. presidential election?” — can serve as context, as can a movie script, show or audio clip.
As context windows grow, so does the size of the documents — or codebases, as the case may be — being fit into them. Long context can prevent models from “forgetting” the content of recent docs and data, and from veering off topic and extrapolating wrongly.
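To make the idea of a window concrete, here is a minimal sketch of a context window treated as a fixed token budget. It is purely illustrative and is not how Magic's LTM models, or any production model, handle context: the whitespace tokenizer and the helper names `toy_tokenize` and `fit_to_window` are hypothetical, made up for this example.

```python
def toy_tokenize(text: str) -> list[str]:
    """Hypothetical tokenizer: splits on whitespace. Real tokenizers use subword units."""
    return text.split()


def fit_to_window(tokens: list[str], window_size: int) -> list[str]:
    """Keep only the most recent `window_size` tokens, dropping the oldest first."""
    if window_size <= 0:
        return []
    return tokens[-window_size:]


source = "def add(a, b): return a + b"
tokens = toy_tokenize(source)

# With a tiny window, the start of the function falls out of view.
small_view = fit_to_window(tokens, 3)

# With a window larger than the input, nothing is dropped.
large_view = fit_to_window(tokens, 100)
```

In this toy setup, the small window sees only the tail of the function, which is the "forgetting" problem in miniature; a large enough window sees everything.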
Magic claims its latest model, LTM-2-mini, has a 100 million-token context window. (Tokens are subdivided bits of raw data, like the syllables “fan,” “tas” and “tic” in the word “fantastic.”) 100 million tokens is equivalent to around 10 million lines of code — or 750 novels. And it’s by far the largest context window of any commercial model; the next-largest are Google’s Gemini flagship models at 2 million tokens.
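For a rough sense of scale, the figures above can be turned into per-unit averages. The sketch below uses only the numbers quoted in this article; the derived averages are back-of-the-envelope implications of those numbers, not figures published by Magic or Google.

```python
# Figures as quoted in the article.
CONTEXT_TOKENS = 100_000_000   # LTM-2-mini's claimed context window
LINES_OF_CODE = 10_000_000     # stated code equivalent
NOVELS = 750                   # stated novel equivalent
GEMINI_TOKENS = 2_000_000      # context window cited for Gemini flagship models

# Implied averages (derived here, not stated by either company).
tokens_per_line = CONTEXT_TOKENS / LINES_OF_CODE
tokens_per_novel = CONTEXT_TOKENS / NOVELS
window_ratio = CONTEXT_TOKENS / GEMINI_TOKENS

print(f"Implied tokens per line of code: {tokens_per_line:.0f}")
print(f"Implied tokens per novel: {tokens_per_novel:,.0f}")
print(f"LTM-2-mini window vs. Gemini window: {window_ratio:.0f}x")
```

By these numbers, the equivalence assumes about 10 tokens per line of code and roughly 133,000 tokens per novel, and the claimed window is 50 times the size of the Gemini figure.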
Magic says that — thanks to its long context — LTM-2-mini was able to implement a password strength meter for an open source project and create a calculator using a custom UI framework pretty much autonomously.
The company’s now in the process of training a larger version of that model.