Transcript

This transcript was generated automatically and may contain errors.

Hadley, over to you. Thanks Ella. Let's go.

Okay, so thanks for having me and today I want to talk about using Claude Code for R. And I'm mostly going to be focusing on programming, not data analysis.

Maybe you can ask me a bit about Claude Code for data analysis at the end if you want, but I want to focus on using it for software engineering because that's what I've been doing a bunch of lately.

So to give you a little bit of a sense of what I've been using Claude Code for, I have been doing a little bit of vibe coding. I made my first Swift app so I could have a talk timer on my iPhone, which I guess I should be using today. I also made a live polling app in Go, and I tried making an R package for generating slide images; you'll see those images a little later.

But I don't really want to talk about vibe coding today because I think you hear a ton about that already. What I want to talk about is just using it for coding because Claude Code has very quickly become an essential part of my software engineering toolkit. So most recently I've been using dbplyr and testthat and both of these are like big important packages where they have to be right.

So dbplyr translates to SQL. It's eight years old and heavily used in enterprise workflows. testthat is used for unit testing in R. It's 16 years old and it's used by, I think, now over 10,000 R packages. These are longstanding, established R packages used by tons of people, where it's important to get it right.

And so Claude Code has now become like I think an indispensable part of my software engineering toolkit. So I'm going to start with a little bit of sort of provocation and then I'm going to perhaps ill-advisedly have a go at doing some live Claude Code demos for you.

Claude Code as important as Git

And so somewhat provocatively I'm going to start with a quote, a quote from me. And that is it's really starting to feel to me like Claude Code is just as important as Git for high quality software engineering.

And I think this is a pretty provocative and maybe provoking statement, but I have been really impressed with the last round of models, particularly Opus 4.5 and 4.6 and their ability to write R code.

In the small, that is. They're not great at assembling large chunks of code. They're not great at organizing, at thinking really big and figuring out what problems you want to solve. But they're often better than me at writing functions that are correct.

And I should also say like I'm going to be talking about Claude Code today. That's what I use. I think the other tools like Cursor and Codex and Gemini, they're all kind of much of a muchness. But my sense is certainly for R code, Claude has the lead right now.

And if this statement is a little scary to you, you know, I think that's totally reasonable. But to me, it feels like software engineering has like changed irrevocably. The genie is out of the bottle and there's no way for us to stuff this back in. I certainly have like deeply ambivalent feelings about many of the consequences of generative AI and LLMs, but there's no going back. We have to like embrace this. We have to try and grapple with what this means.

But I think the good news is this has not taken any of the joy of programming away from me. I still love my job. I still love engaging with code. It's just that the way I engage with code has changed a little bit. No longer is it me directly typing the code into the computer; it's me mediating my thoughts via Claude, who's then going to write a lot of the code.

Quality over quantity

And I think one of the things that's really interesting to me is that a lot of the best practices for humans actually turn out to be best practices for agents as well. And kind of weirdly and unfortunately, it feels like we're willing to do these things for agents that we've never been willing to do or have been less willing to do for our colleagues in the past.

This is just things like: having meaningful file names is just as important today. When Claude is looking for files, it takes a stab at what you might have named the file and globs through a bunch of directories, so meaningful names are just as useful for your AI assistant as for your human colleagues. Similarly, using consistent prefixes and suffixes for function names and file names: in the same way these help situate humans, they also help situate agents.
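As a tiny, concrete sketch of why naming conventions pay off mechanically (the file names here are invented for illustration, not from any real package):

```shell
# Hypothetical package layout with a consistent "tag-" prefix for related files.
rm -rf /tmp/naming-demo
mkdir -p /tmp/naming-demo/R
touch /tmp/naming-demo/R/tag-parse.R /tmp/naming-demo/R/tag-format.R /tmp/naming-demo/R/utils.R

# An agent (or a colleague) wondering "where does tag handling live?" can
# narrow the search with one glob instead of opening every file:
ls /tmp/naming-demo/R/tag-*.R
```

The same glob works for a human at the shell and for Claude Code's file-search tools; consistent prefixes make that first guess land.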

Similarly, using comments to capture the why is really important. LLMs, I think by default, still tend to write a lot of unnecessary comments, comments that just explain what they're doing, which are not so useful. But you can certainly prompt them to do better, and you yourself can continue to write comments. And one of the things I find a little interesting and surprising is that agents seem to be better at reading the comments than I am. They will actually read stuff that previous me has written and take it into account, where it's very easy for me, when I'm working on something, to skip over them.

Another thing that turns out to be even more important for agents than for humans is having high test coverage: write unit tests and run them. We'll talk more about this a little later, but I think one of the things that makes Claude Code profoundly different to just using a chat on the web is this feedback loop: it can write code, it can write tests, and it can run the tests and see if they return the right thing. And while there are certainly lots of potential failure modes there, that idea of having the implementation of the code and the expectations for the code written semi-independently really helps improve the quality of the code and makes it less likely that the agent will make a dumb mistake.

Another thing that I've found really useful is aggressively refactoring code bases, refactorings that probably never would have reached the threshold of being worthwhile to do previously, because the cost of writing code has dropped so dramatically with Claude Code that tackling these things is now much, much, much more worthwhile than it was before. And that helps you, obviously, not just humans, but also helps the agents that are going to be looking at the code.

And similarly, enforcing style guides, telling the agent what to do, telling your human colleagues what to do, super, super useful. Because at the end of the day, I think it's still your code, regardless of whether you have produced it by hand typing it or you've used your robotic amanuensis to take your thoughts and turn them into code, you are still responsible for it. You have to own your work. Because if it goes wrong, you can't blame Claude. It's your responsibility at the end of the day.

And I think it's just as important as it ever was to continue to sweat the details. It's always been possible to half-ass writing code, and it's easier than ever now: you can half-ass it with Claude Code. But you can also put the time and effort in, really sweat the details, and get something fantastic out of it.

Because at the end of the day, I think a coding agent can take you where you want to go. And it can often do it more quickly than you might have been able to do so yourself. But it can't pick the destination. You are still in the driver's seat, although not in this image. You are still directing the process. It's your responsibility to figure out what is actually important. What are you trying to do?

Ways to interact with LLMs

So that's kind of the provocation: I think Claude Code is really important; maybe a little provocatively, as important as Git is to me today. There are a lot of big feelings around this, and we all have to grapple with it, because it can make you so much more productive. And not just producing more code, but producing better code. So now I want to show you a little bit about what that looks like.

So I think there are really four ways to interact with LLMs right now. The first, and the place where pretty much everyone starts, is using a chatbot on the web. I think everyone's pretty familiar with this. They can be great for writing R functions, and great for all sorts of other questions; many people are using these every day for all sorts of things, not just R code.

And from there, you can go in two directions. One direction, which I'm not going to spend any time on today, is you can start writing code to use the chatbot. You could use a package like ellmer, for example, if you're an R user, and use that for all sorts of data science tasks. In particular, I think one of the data science superpowers you get with ellmer and LLMs is the ability to turn unstructured data, whether that's text, images, audio, or video (all types of data that were pretty hard for data scientists to wrestle with previously), into nice, tidy, rectangular datasets using structured data tools.

The other direction you can go is, well, if you're writing a bunch of code in Claude or ChatGPT, you're going to start doing a bunch of copying and pasting. And to me, that kind of feels like the bad old days, before Sweave, before R Markdown, before Quarto, where you'd do something in R, produce a nice plot, copy and paste it, and stick it into your Word document. That's fine as far as it goes, but of course your data changes, or you made a mistake in the code, and now you've got to re-copy and paste. It turns out that repeated copying and pasting from one program to another is just a bad workflow. It's frustrating, error-prone, and full of inaccuracies.

So the next place you might go is, instead of using an external chatbot, you might embed the chatbot inside of wherever you're writing code. You might have heard we've recently been working on this tool called Posit Assistant for RStudio, and we have Positron Assistant for Positron, soon to be unified, so we'll have Posit Assistant everywhere. There are two big advantages here. First of all, you're no longer copying and pasting; the agents can now edit your scripts directly. The other big advantage is they have so much more context about what you're doing. They know what packages you have loaded. They know what variables are in your R session. Because they have all of that additional context, they can do a much, much better job.

And then a step further is moving towards these coding agents, where now you mostly just interact with a text box in a terminal, by and large, and it goes away and does a bunch of stuff autonomously.

What makes a coding agent

It's a little bit hard to explain this; I'm going to take my best stab, but we're going to switch to a demo shortly, so if this doesn't make sense, don't worry too much. Hopefully you'll see how this feels in practice. The components aren't so different, but the feel of using it is quite, quite different.

So what makes a coding agent? Well, first of all, the least important difference is that you're typically in a terminal user interface rather than a GUI. I think this is mostly a cosmetic difference. I think we're seeing these terminal user interfaces basically because every software engineer feels a little bit threatened by LLMs, so we're retreating to the nostalgic era of the 80s and 90s and interacting with the computer via a terminal.

It also has a big custom system prompt with some advice about writing code; again, not that useful and not that interesting. A bigger difference is that it comes with a bunch of built-in tools: tools for searching for files in the current directory, for reading part of a file or an entire file, and tools for changing those files, like creating new files or editing existing ones. And the piece that brings it all together is that it can run tools on the command line. It can basically run any tool on your computer, and that means it can run R code. That gives Claude Code this feedback loop: it's no longer just blindly creating something and giving it to you, the human, to run and give feedback on. It's going to autonomously run code, look at the results, and make decisions based on that.

And I pulled this slide from a very old workshop I gave about developing packages in R using devtools. But this is what you're trying to do as a human, right? You want to build up a good feedback loop, because no one reliably writes code correctly the first time. Maybe some people do; I certainly don't. So you need a feedback loop: you need to be able to write code and then test it to make sure it's correct. And this role of unit testing is still incredibly important, more important today than it's ever been, because it gives you a double-entry bookkeeping system where you've declared what you want in two different places in two slightly different ways. If those two agree, then the chances are the code is correct. Of course, it's still possible to make the same mistake in multiple places, but it's much, much less likely that you'll make the same mistake in two places in exactly the same way.

And this is what Claude Code gives you, particularly when you couple it with some information about how to run R code. When you're using Claude Code, you're going to have a CLAUDE.md; that's where you give it the instructions it needs to understand and use your tooling. Our standard Claude Code setup has all of this information: how do you run R code? Well, you use Rscript on the command line and call devtools to run the code, to run the tests; if you want to document the package, you do this; if you want to check the package, you do that. And so this gives it the ability not just to do stuff, but to tell if it's done the right thing.
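As a sketch of what that file can look like, here's an illustrative CLAUDE.md fragment for an R package. The exact commands and wording are my reconstruction of the kind of setup described above, not the actual file from the talk:

```markdown
## Running R code

- Run a snippet of R: `Rscript -e '<code>'`
- Run all tests: `Rscript -e 'devtools::test()'`
- Run tests for one file: `Rscript -e 'devtools::test(filter = "<name>")'`
- Regenerate documentation: `Rscript -e 'devtools::document()'`
- Check the package: `Rscript -e 'devtools::check()'`

## GitHub

- Read an issue and its comments: `gh issue view <number> --comments`

## NEWS

- Put the function name first in each bullet and reference the issue number.
```

The point is the feedback loop: with run, test, document, and check spelled out, the agent can verify its own work instead of guessing.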

Live demo: fixing a Roxygen2 issue

So let's dive in. I'm here in the roxygen2 package, and I have found three issues that we're going to have a go at solving with Claude Code. We're going to ramp up in terms of difficulty, but for this first one, let's take a look at the issue: parsing Nimble code gives an unclosed curly bracket.

And when I looked at this issue earlier, it immediately popped into my head: I bet I know what's going on. That is, when you document something that's not a function with roxygen2, it automatically calls class on it so it can generate something.

So this is an example of documenting a dataset. Whenever you document something that's not a function, it ends up in this dataset path, somewhat weirdly. And if you document this specific type of object produced by the Nimble package, you get an error, and I'm pretty sure it's because of the object's class. It reminds me of this very weird thing where, if you quote a single backtick, it has a very weird class. I knew that from my background knowledge, so I put that hint in here.

I'm recording this knowledge in the issue. I think that's good practice: whenever I'm working on a more complicated issue, I try to record that knowledge in the issue. That's good for me when I come back, and hopefully Claude Code will pick up on it. So I'm going to open up a terminal.

And I'm going to launch Claude Code. So I'm just running this inside a terminal inside Positron, which is kind of crazy. And then I'm going to tell it what to do. And one super kind of interesting technique is I can say, okay, just fix this issue.

And it's going to think a little bit. One of the things our CLAUDE.md says is how to find out more information about an issue: it's going to use the gh tool to download the issue and read all the comments. So it's doing much what you would do as a human, right? If I told you to fix this issue, the first thing you'd do is go and read it.

Okay. So it did that. It failed. Maybe it listened to me; it's finally figured out what to do. One of the things about Claude Code is that it's not particularly brilliant. What it is, is dogged. It is just going to grind away again and again and again, and it's going to make a ton of dumb mistakes. Sometimes it corrects itself, sometimes it doesn't, but this thing has grit. It's just going to grind away until it solves the problem.

So it's doing a few little experiments, like, okay, I'm going to see what this is. Just doing some investigation. Okay. It's doing a little experiment where it's going to try and document it.

So this is the type of thing where, if I wasn't in a talk, you could just let it grind away. Okay, so it's going to make a whole entire package somewhere, and it's just checking with me that it's okay to run it. It's going to document it. That's fine.

Okay. So it's finally created a reproducible example, right? Again, it's working much like you might tackle this as a human: the first thing it does is create a reproducible example. And the cool thing about this one is it actually went away and created an entire tiny little package, which is something maybe I would eventually have done, but it's kind of annoying and would take a little while.

Oh, this is interesting. Okay. So it's decided to fix it. Figured out what the problem is. It decided to call it the escape ID braces function. And then it's like, oh, actually, well, I need to create that function too. So it goes away and does that.

So now it's going to add a test; that's good. And it's figured out that it doesn't have to recreate the entire thing: it can call the format helper directly instead.

And at this point, I'm just going to open my Git pane and see what it's actually done. That news bullet: if you look at my CLAUDE.md (we have a helper in the development version of usethis, use_claude_code(), that will create this), it just accumulates all of the little things Claude has gotten wrong when adding news bullets in the past. So I've explicitly told it: put the function name first and reference the issue number.

And what's this changed here? Okay. It's added a test. And that looks correct.

And, you know, if various things are missing or out of order, then I just say: okay, go for it. And it will normally figure it out.

Okay, so now I've inspected the changes. And I don't yet trust Claude Code to do any of the Git stuff for me, so I'm going to make the branch myself; I'll call it escape-braces.

And now humans are going to read this. So I have to type carefully.

And now I've already forgotten what issue that was. Fixes #1744. I'll commit that and push it.
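That manual Git step looks roughly like this. This is a sketch against a throwaway repository: the branch name and issue number echo the demo, but the file and commit wording are stand-ins:

```shell
# Claude Code wrote the fix; I still drive Git by hand.
# Self-contained sketch in a scratch repo so the commands are safe to run.
repo=/tmp/escape-braces-demo
rm -rf "$repo" && mkdir -p "$repo/R" && cd "$repo"

git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "# fix goes here" > R/escape.R

git checkout -q -b escape-braces              # human-readable branch name
git add R/escape.R
git commit -q -m "Escape braces in generated Rd. Fixes #1744."
git log -1 --pretty=%s                        # print the commit subject
```

"Fixes #1744" in the commit message is what lets GitHub close the issue automatically when the pull request merges.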

So what I'll typically do with a pull request like this, or in fact, any pull request I do that isn't going to be reviewed by another separate human, is I'll just like let this sit, like ideally for at least 24 hours. So I've kind of like forgotten what I did, because that's about as long as my memory lasts. And then I'll come back and kind of look at it with fresh eyes. And then I can be like, okay, well, you know, this looks good or this doesn't.

Live demo: planning a larger refactor

A problem solved. So let's now tackle a slightly bigger one. How are we going for time?

Okay, I think we'll go. I'm going to skip this one; it's a little bit more complicated. Let's jump to this bigger issue, which is to move to an R 4.0 dependency. So let's switch back to Positron.

I'm going to close this PR and switch back to Claude Code. And one thing I'm going to do here is clear the context. You don't have to do this, but I'm basically saying: I'm tackling a new problem now, so ignore everything we've talked about; we're going to tackle something new.

And what I want to tackle in this one is that currently roxygen2 depends on R 3.6, but according to the tidyverse backward compatibility rules, we can now move to R 4.0. This is a bigger job, so I'm going to say: come up with a plan for upgrading roxygen2 from R 3.6 to 4.0. Think about pipes, inline lambdas, and raw strings.

So the big difference here is I'm asking it to come up with a plan. If it's a simple thing, I'll just say go and do it. But for bigger tasks, I'll say: hey, can you come up with a plan first?

And so let's see what it's going to do. Okay. Well, first it's going to research. Okay. It's going to actually do some Googling to figure out what are new features in R4.0. Okay. That's kind of cool. So you can see it's doing a couple of things at the same time. It's exploring the code, and it's exploring what are R4.0 features, and it's looking for that issue I pointed to.

So, oh gosh, now it's reading the news. Good luck with that.

Well, we'll see how it goes. Again, Claude Code is obviously not as intimately familiar with what we might do as maybe I am on a good day, but it's going to go off and do a bunch of research, just churning away, reading a lot of webpages. Which reminds me of one of the things I keep meaning to do, which would be useful for humans too: because in the tidyverse we are about five years behind the latest in R, we should write a blog post. Like, hey, we can finally use R 4.0; what are all the cool new features in R 4.0?

Okay, so it's come up with a plan. Let's read it. It's going to bump the R version; okay, that's a good start. Oh, I'd forgotten that stringsAsFactors went away in R 4.0, so it can get rid of all of these. It can also convert single-line anonymous functions to the new syntax. And it can use raw strings instead of these really complicated escapes; let's see if we can find an example.
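To make the plan concrete, here's the flavor of before/after rewrite being proposed. These examples are mine, not from the actual diff; and strictly speaking, the \(x) lambda shorthand (and the native pipe) only arrived in R 4.1, while raw strings and the stringsAsFactors default change landed in R 4.0:

```shell
# Printed as text rather than executed, since these lines are R, not shell.
cat > /tmp/r40-examples.txt <<'EOF'
# before                                    # after
sapply(x, function(x) x$tag)                sapply(x, \(x) x$tag)
grepl("\\.Rd$", path)                       grepl(r"(\.Rd$)", path)
data.frame(x, stringsAsFactors = FALSE)     data.frame(x)
EOF
cat /tmp/r40-examples.txt
```

The raw-string form r"(...)" is the big win for roxygen2-style work: patterns full of literal backslashes and braces no longer need double escaping.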

So this is really good, because I can inspect its plan and be like: oh, okay, you forgot about this. So I ask it to refine the plan.

So it's doing stuff, I'm giving it feedback, and hopefully it will integrate that nicely. But the big idea here is that for complicated stuff, it's really, really worthwhile to do a plan. Then you can inspect it, look for any big problems, and then say: okay, go off and do that. Which we won't have time for today.

Closing thoughts

So we'll leave it there. It's come up with a good plan, and I'm going to quickly conclude.

So what I wanted to really emphasize is that so much of what you see of Claude Code in the media is the sort of vibe coding: driving through the desert as fast as you possibly can, running over anything that might be in your way. Absolutely, you can do that with Claude Code. But there is, I think, an alternative pathway you can take, which is really thinking about quality. And I would strongly encourage you to adopt that mindset when you look at Claude Code.

So, you know, the genie is out of the bottle. I think software engineering has changed irrevocably, and regardless of how you feel about that, you've got to figure out ways to cope with it. But at least in my experience, you can still have a huge amount of fun programming and writing R code.

You just have to change your mindset: you're maybe no longer directly typing R code, you're talking with Claude, and it's producing the code for you. But you still absolutely get out what you put in. You can create sloppy, vibe-coded garbage; sure, go ahead and do that if you want. But you can also use these tools to produce really high-quality pieces of art. It's not going to save you that much time compared to what you would have done previously, but I think you can go further, you can tackle problems in new domains, and you can still continue to produce really high-quality code.

Thank you.

Q&A

Oh, thank you so much, Hadley. That was so fascinating, to hear your thoughts but also to see in practice how it all works and how you use it. We've got about five to seven minutes for Q&A. We've had ten questions in the chat, and we've been asking people to upvote them because I don't think we've got time to go through all of them. But the top question is: if AI-generated code replaces our need to type code, what would you say are the key skills for a data scientist now? And I'm going to extend that slightly to data scientists and software engineers.

Yeah. I mean, the one thing that's obvious to me is that the relative value of writing code has decreased, but the relative value of reading code has increased, because if you don't know what that code is doing, you're in a very, very dangerous place. So all of those skills around reading and understanding and critiquing and thinking about code are super critical. I think that's one place where it's clear you still want a human in the loop; you still want to be checking what's going on.

The other thing, which is harder to illustrate in a little snippet like this, is that they just don't have a good sense of the big-picture stuff. Obviously, they don't know what you should be doing. And for data analysis, they're also not as good.

But I should say the other big difference between data analysis and software engineering is that in software engineering you have unit tests: you know what the correct answer is and you can enforce it. When you're doing data analysis, when you're doing data science, you do not know what the correct answer is; you're bringing this combination of skepticism and curiosity to it as a human. That feedback loop is missing, and I think it makes it much, much harder; it's going to take longer for AI to really transform data science in the same way it has software engineering.

But certainly the cost of making a simple Shiny app to interactively explore some part of your data has dropped from being a multi-hour investment to maybe two minutes of work. So I think it's really changing the trade-offs. I don't know how that's going to play out in the long run; no one knows that. But instead of having to struggle to write that Shiny app, you can whip up a quick prototype. It's 100% fine to vibe code that: you're the only person who's going to use it, you'll use it for 15 minutes and throw it away. As the cost of code decreases, the value of these throwaway apps increases, and that gives you the ability to really narrow in on an interesting bit of data and explore it.

Thank you. Yeah, it's so interesting what you say about the reading and the writing and how that balance shifts.

I guess kind of a follow-up to that is our next most upvoted question, which is that there was a randomized trial, actually by a pro-LLM organization, that found that LLM-assisted bug fixes end up taking approximately 20% more developer time, while the developer thought they were saving time. And the asker, who says they're not an active LLM user, felt your example resonated; this question popped up on the curly brace example. Would you have fixed that escape braces bug faster on your own? What's been your experience?

Yeah, that's an interesting one, because I think that's a class of problem where I know there's basically one line of code I need to change to fix it, which is essentially what the LLM did. But for me to remember where that line of code is... maybe I would have been able to track it down, but I don't think it would have been that much faster. And the advantage is that, if I was doing this in real life, I would have just let it rip while I was triaging other bugs, so I'd be able to do something else at the same time it was tracking down that problem.

So I 100% agree that it is very easy to fool yourself into feeling like you're moving faster with an LLM when you're not actually moving any faster than before; it just feels faster, or more fun, or different in some way, and you have to be alert for that. But on the whole, for me, I think it's a big performance boost. It does make me a lot faster, maybe twice as fast, when I'm tackling a certain class of problems, and that is a pretty amazing performance improvement for me.

Well, when we think about how much you got done without relying on Claude Code, with your extraordinarily prolific output, this is an exciting time for R users and developers if you're going to be doubling your output.

So that's actually counteracted by the fact that I have more meetings now. I kind of feel like me plus Claude Code plus meetings is equivalent to me without Claude Code and without meetings. So I do enjoy that.

Okay, fair enough. Fair enough. Well, Hadley, there are lots of other questions, and unfortunately we're not going to have time to go through them all. I'm not sure if you've joined the Discord server, but if you're happy to do that and chat a little bit more, people would appreciate it. Otherwise, thank you so much for coming along to be our keynote today, and for your support of the conference in general. We really appreciate you and all the work you do for the R community.