Welcome to the Future

AI, Devin, and the future of frontend development.

Issue 18

Hey friends 👋 First of all, thank you so much to everyone who replied to the last newsletter and joined me in the fight against the anti-spam bots. I loved hearing from all of you, and I hope that you'll continue to hit reply every once in a while, even if it's just to say hi :)

Today's issue is about something we don't typically talk about in the newsletter—AI and its implications for the future of frontend development.

Don't worry, Frontend at Scale is not turning into an AI newsletter anytime soon (unless this issue performs really well, in which case get ready for the first edition of AI at Scale in a couple of weeks), but I think it's important to talk about it at least once. Let's jump right in!

FRONTEND DEVELOPMENT

Welcome to the Future

Photo by Pavel Danilyuk on Pexels.com

Unless you’ve been completely away from social media this past week (in which case please send me some tips about how to be more like you), you’ve probably heard about Devin by now—the first “AI software engineer.”

Like most people who write code for a living, I didn't take the announcement of Devin too well. It had me googling jobs AI won't steal and thinking very hard about the future of our discipline.

But Devin wasn’t the only AI that caught my attention this week. By the time it was announced, I was already in alert mode after learning about a freshly published research paper titled Design2Code: How Far Are We From Automating Front-End Engineering?

Now, I don’t know about you, but as a frontend developer who would very much like to still have a job next year, that is a question I would love to know the answer to. So today I want to spend some time going over this research, tell you my thoughts about it, and maybe speculate a little bit about what the future of frontend might look like.

So, how far are we?

Let’s start by talking about the paper, which was published a couple of weeks ago by a group of researchers from Stanford University, Georgia Tech, Microsoft, and Google.

In a nutshell, the research tried to determine how accurate LLMs are when translating a visual design (a screenshot of a website) to a code implementation in HTML and CSS. Here’s some of what they accomplished as part of their research:

  • They created a benchmark of 484 “real-world” web pages, after filtering down a dataset of 128,000 pages based on their length and content.
  • They created a set of automated metrics to compare the code produced by the LLMs against the original designs.
  • They fine-tuned an open-source model called Design2Code-18B, trained specifically to perform, well, design-to-code translations.
  • They ran their benchmark of real-world pages against five different models and compared how well each one performed, using both their automated metrics and human evaluators.
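The paper's actual metrics compare things like matched text blocks, element positions, and colors between the rendered pages. As a rough illustration of the idea (this is a naive stand-in, not one of the paper's metrics), an automated visual comparison could be as simple as checking what fraction of pixels roughly agree between two rendered screenshots:

```python
def pixel_similarity(original, generated, tolerance=10):
    """Fraction of pixels that roughly match between two same-sized
    screenshots, each given as a list of rows of (R, G, B) tuples.

    Illustrative stand-in only — the Design2Code paper uses more
    sophisticated element-level metrics than raw pixel comparison.
    """
    total = matches = 0
    for row_a, row_b in zip(original, generated):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            # Count a pixel as "matching" if every channel is within tolerance.
            if all(abs(a - b) <= tolerance for a, b in zip(px_a, px_b)):
                matches += 1
    return matches / total if total else 0.0
```

A metric like this rewards implementations that look right, regardless of how the underlying HTML is structured—which is exactly why the researchers paired their automated scores with human evaluations.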

The five models they ran their benchmark on include:

  • Two commercial models (GPT-4V and Google's Gemini Pro Vision)
  • Two existing open-source models (WebSight VLM-8B and CogAgent-Chat-18B)
  • The open-source model they fine-tuned as part of their research (Design2Code-18B)

For each of the commercial models, they ran the benchmark using three different methods: Direct Prompting, Text-Augmented Prompting, and Self-Revision Prompting—a two-pass approach in which the output of the first pass is given back to the LLM so it can (hopefully) improve on it.
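To make the self-revision idea concrete, here's a minimal sketch of that two-pass loop. The `model` callable is a hypothetical stand-in for a real vision-model API, and the prompts are simplified—the paper's exact prompts differ:

```python
def self_revision(screenshot, model):
    """Two-pass design-to-code generation: generate once, then ask the
    model to revise its own output against the original screenshot.

    `model` is a hypothetical callable wrapping a vision-language model
    API; it takes a text prompt plus an image and returns generated code.
    """
    # Pass 1: direct prompting from the screenshot alone.
    first_draft = model(
        prompt="Reproduce this webpage as a single HTML file with embedded CSS.",
        image=screenshot,
    )
    # Pass 2: show the model its first attempt so it can self-correct.
    revised = model(
        prompt=(
            "Here is your previous attempt:\n" + first_draft +
            "\nRevise the code so the page matches the screenshot more closely."
        ),
        image=screenshot,
    )
    return revised
```

The interesting design choice here is that the model never sees a diff or an error message—it just gets its own output back alongside the target, and has to spot the discrepancies itself.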

There’s a lot more detail about these methods in the paper. If you’re an LLM aficionado, I recommend giving section 3 a read, where you’ll find the exact prompt they used for the Direct Prompting approach.

The prompt used to instruct the LLMs to perform the design-to-code task

So what are the results of the LLM showdown? GPT-4V was the clear winner, earning the highest scores in both the automated and human evaluations.

The human evaluators thought that the GPT-4V creations could replace the original designs in 49% of cases. And perhaps more surprisingly, they also thought that the AI versions looked even better than the original designs in 64% of cases.

These are really impressive results. And after reviewing a bunch of examples myself (you can download the screenshots and code from the project’s repo), I can tell you that in the cases where the AI didn’t nail the implementation, it was pretty darn close.

Here’s an example:

An example website from the Design2Code benchmark (left) alongside the resulting GPT-4V implementation (right)

The HTML and CSS code looks quite decent as well. There are some gaps in the use of semantic tags, and in some cases, it looked like the AI couldn’t figure out how to use CSS Grid instead of Flexbox (I guess that’s one thing we have in common), but for the most part, the code was quite clean and minimal.

So does this mean it’s all over for us frontend folks? Should we start looking for a career change? Well, not exactly.

First of all, even though the benchmark consisted of real-world websites, the designs of the test pages were extremely simplistic. In fact, the example above is one of the most complex ones I could find in the benchmark, and it’s still quite a simple layout.

This is likely because even something as simple as a background image or a gradient could throw the AI off and produce inconsistent results. The evaluations would probably look very different if we had asked the LLMs to implement a site like Stripe.com.

Additionally, the research was focused on static implementations only. We know that LLMs can generate dynamic websites with JavaScript based on prompts, but we don’t know yet how accurate these implementations are in real-world scenarios—or how consistently they work across multiple pages.

So while the progress is quite impressive, when it comes to automating all aspects of a typical frontend engineer’s job, I’d say AI still has a long way to go.

The Future

AI researchers are aware of the limits of LLMs when it comes to automating frontend development, so we can expect research to continue pushing these limits forward. Here are some of the suggestions for future research from the Design2Code paper:

  • Experimenting with prompting techniques to handle more sophisticated designs. For example, breaking down the work into smaller chunks instead of asking it to implement the entire thing at once.
  • Hooking up tools like Figma to an LLM to generate component libraries and more complex (and interactive) layouts.
  • Experimenting with creating dynamic websites with JavaScript. We know that LLMs can generate simple websites like calculators or joke apps, but we’ve yet to see how well they can stitch together an entire SPA.

But even though AI might start to replace some of our responsibilities as frontend developers, there are several aspects of our jobs that will continue to be important in the years to come. In fact, some of them might even become more important than ever.

  • Front of the Frontend — As we saw in the Design2Code research, LLMs are pretty good at producing simple layouts, but we haven’t seen much evidence on how accurately they can translate complex designs. Developers who specialize in CSS, UI development, and animations (among others) will play an important role in both teaching and perfecting AI implementations, and wiring up individual pages into cohesive, well-structured websites.
  • Back of the Frontend — A trend we’ve been seeing over the past few years, even before the rise of AI, is that the back of the frontend is moving increasingly to the server. Modern JavaScript frameworks are, for the most part, server-first nowadays, which means that there’s a big demand for frontend engineers who can do some server-side development as well. Even if you have no intention of becoming a full-stack developer, if you have an inclination for the back of the frontend, my advice is to spend some serious time developing your technical breadth rather than specializing in a particular framework or language.
  • Architecture — Unless we reach the point where AI can handle 100% of the tasks associated with building and maintaining a software application, the codebase will still need to be understandable by humans. And a big part of making a frontend codebase understandable and easy to change is having a solid architecture. I expect that the way we structure applications will evolve as more AI tools start coming into the picture (for instance, by making it easier for an AI to navigate the codebase), but despite these changes, the fundamentals of architecture will continue to matter.
  • APIs and Documentation — These are areas that might even become more important in the coming years. There’s a big gap in the internal documentation of software projects, usually because a lot of it is part of the shared understanding of (human) developers. But AI can’t read our thoughts (yet), so this documentation will need to exist more formally, even if it’s just for a machine to consume. Same with APIs. Where we previously invested in CLIs and other developer tools to make human developers more productive, we might now start to see investments in better APIs and documentation to make AI more efficient and accurate.

A screenshot of Devin pulling up API documentation to complete a task

The Future Future

Now, you might be thinking, “Sure Maxi, but what about the future future? What happens when AI is perfected and doesn’t need any of us anymore?”

That’s a fair point, and it’s a perfectly reasonable thing to be concerned about. But if that future ever happens, I don’t think it’ll happen as soon as some people want us to believe it will.

I like the way Ryan Peterman puts it in his latest newsletter. Inspired by the evolution of self-driving cars, he breaks down the adoption of AI in software development into four stages:

  1. Humans write code and we collect data
  2. AI starts helping with simple and well-defined tasks (e.g. using ChatGPT or Copilot to write a script or autocomplete a function)
  3. AI starts to take on more complex tasks (e.g. researching, planning, debugging, and combining APIs)
  4. AI is “feature complete” and we start working on the long tail of quality improvements until it reaches full autonomy

The rise of Devin puts us at step 3 of this four-step process—which is why it might seem like the end is so near. But full autonomy is a really long tail, and human developers will still have a role to play during that entire process.

You’d be right to think that the risks of imperfect AI in software development are not the same as with self-driving cars. But even if no human lives are at stake, businesses would still prefer not to lose millions of dollars or go out of business because Devin made a mistake or can’t solve a production outage.

It’s really anyone’s guess what will happen in the future, but if there’s one thing we know for sure, it’s that these tools are here to stay, and neglecting them would be a big mistake.

So we shouldn’t think of Devin and other AI coders as our enemies; it’s better to think of them as our new best friends. We should spend time with these tools and learn as much as we can from them. In the years to come, learning to be productive with Devin and its friends might play as important a role in our careers as our ability to write code.

ARCHITECTURE SNACKS

Links Worth Checking Out

Just in Time Architecture by Macklin Hartley

📕 READ

  1. If you’re looking for some non-AI takes on the future of frontend development, this article on the Frontend Mastery blog has some good predictions.
  2. And if you're looking for some additional AI takes, Josh Comeau's blog post on the end of frontend development holds up really well a year after he first published it.
  3. There’s a new performance metric on the block. Interaction to Next Paint (INP) officially became a Core Web Vital this month, replacing good old First Input Delay (FID) (also, RIP).
  4. Josh Collinsworth wrote an insightful article about the devaluation of frontend development, which honestly has always been a problem—but he thinks it might be getting worse.
  5. The fourth edition of one of my favorite JavaScript books, Eloquent JavaScript by Marijn Haverbeke, was just published. And like previous editions, you can read it for free online.


🎥 WATCH

Just in Time Architecture by Macklin Hartley

This is not a frontend-oriented talk at all, but it's a great architecture explainer. If you're looking for a friendly guide to the trade-offs between monoliths and microservices, I recommend you give this one a watch.


🎧 LISTEN

Rethinking frontend architectures with data layers with Max Stoiber

I enjoyed this conversation with Max about GraphQL, data layers, and how his open-source project Fuse.js can help us create type-safe data layers to connect our backend APIs with our frontends.

That’s all for today, friends! Thank you for making it all the way to the end. If you enjoyed the newsletter, it would mean the world to me if you’d share it with your friends and coworkers. (And if you didn't enjoy it, why not share it with an enemy?)

Did someone forward this to you? First of all, tell them how awesome they are, and then consider subscribing to the newsletter to get the next issue right in your inbox.

I read and reply to all of your comments. Feel free to reach out on Twitter or reply to this email directly with any feedback or questions.

Have a great week 👋

– Maxi

Is frontend architecture your cup of tea? 🍵

Level up your skills with Frontend at Scale—your friendly software design and architecture newsletter. Subscribe to get the next issue right in your inbox.

    “Maxi's newsletter is a treasure trove of wisdom for software engineers with a growth mindset. Packed with valuable insights on software design and curated resources that keep you at the forefront of the industry, it's simply a must-read. Elevate your game, one issue at a time.”

    Addy Osmani
    Engineering Lead, Google Chrome