Developing an AI-powered Javadoc IntelliJ plugin
Recently, I decided to do a bunch of AI side projects to sharpen my AI knowledge. In this article, I want to go over my latest one, an AI-powered Javadoc IntelliJ plugin.
The idea
The idea behind the plugin is straightforward. You have a function in your Java code that is missing Javadoc. If you hit a button (or a keyboard shortcut), the plugin adds a Javadoc comment to the function. GenAI generates that Javadoc, so it contains meaningful text.
Training the model
I did my research and figured out that I would need to fine-tune an existing model. I narrowed it down to either CodeBERT or CodeT5 and, in the end, went with CodeT5-small. The same approach would work with other models, though.
To train the model, I need training data. Google and the open-source community to the rescue: there is already the CodeSearchNet dataset, about 2 million comment-to-code pairs scraped from GitHub. That's exactly what I needed.
I am doing all of this on my MacBook Air M1, so I do not have a CUDA-capable GPU. Training even the small variant of the model on this dataset would take over 80 hours. So I started looking at how to run this on a GPU somewhere. I came across Google Colab, where you can run on a GPU for free (with some limits), or you can buy credits for 10 euros.
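Before kicking off any training in Colab, it is worth a quick sanity check that the GPU runtime is actually attached. A small check of my own, not part of the original notebook:

import torch

# True when a CUDA GPU runtime is attached in Colab
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "no GPU")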
So I have my model, training data, and environment. Let’s start coding.
We will start with some imports:
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, Trainer, TrainingArguments
import torch
import wandb
from google.colab import drive
And some light setup. Here we log in to wandb and mount Google Drive. We can use wandb to monitor metrics during training, and Drive to store our model and training checkpoints.
drive.mount('/content/drive')
wandb.login(key="YOUR_FREE_ACCOUNT_API_KEY")
wandb.init()
Next, we prepare the training data. We load the code_search_net dataset from Hugging Face and tokenize it. The tokenizer has to exist before we can map the tokenize function over the dataset, so we load it first.
# Load the CodeSearchNet dataset for Java
dataset = load_dataset("code_search_net", "java")
# Display an example
print(dataset['train'][0])
# Load the tokenizer for the model we are going to fine-tune
model_name = "Salesforce/codet5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize_function(examples):
    # Tokenize code as input
    inputs = tokenizer(examples["func_code_string"], padding="max_length", truncation=True, max_length=128)
    # Tokenize docstring as target
    targets = tokenizer(examples["func_documentation_string"], padding="max_length", truncation=True, max_length=128)
    # Use the docstring token IDs as the labels
    inputs["labels"] = targets["input_ids"]
    return inputs
tokenized_datasets = dataset.map(tokenize_function, batched=True)
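The full dataset takes a long time to train on, so while experimenting you may want to start with a slice. A quick sketch of my own (the variable names are made up); you would pass these to the Trainer below instead of the full splits:

# Optional: smaller splits for a faster experimental run
small_train = tokenized_datasets["train"].shuffle(seed=42).select(range(100_000))
small_eval = tokenized_datasets["validation"].select(range(5_000))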
And now we train the model. In the example below, it is set up to store checkpoints during training, so if your training gets interrupted, running it again resumes where it left off. (Note that resume_from_checkpoint=True expects a checkpoint to already exist; on the very first run, call trainer.train() without it.) The final trained model is also stored on Google Drive.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
training_args = TrainingArguments(
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
    save_total_limit=2,
    output_dir="/content/drive/MyDrive/model_checkpoints"
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
)
trainer.train(resume_from_checkpoint=True)
model.save_pretrained("/content/drive/MyDrive/final_model")
tokenizer.save_pretrained("/content/drive/MyDrive/final_model")
API development and deployment
Great, now we have our model. Let's do a quick check that it does what we need.
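If you are running this check in a fresh Colab session, load the trained model back from Drive first (same paths as used above):

model = AutoModelForSeq2SeqLM.from_pretrained("/content/drive/MyDrive/final_model")
tokenizer = AutoTokenizer.from_pretrained("/content/drive/MyDrive/final_model")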
def generate_javadoc(code_snippet):
    # Tokenize the code snippet as input
    inputs = tokenizer(code_snippet, return_tensors="pt", max_length=128, truncation=True)
    # Generate summary IDs with the fine-tuned model
    summary_ids = model.generate(
        inputs["input_ids"],
        max_length=128,
        min_length=30,
        length_penalty=2.0,
        num_beams=4,
        early_stopping=True
    )
    # Decode the generated Javadoc comment
    javadoc = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return javadoc
code_example = """
public void storeRecord(Optional<Metadata> metadata) throws BadDataException {
    fillMissingData(metadata);
    metadataRepository.store(metadata);
}
"""
# Generate Javadoc
print("Generated Javadoc:")
print(generate_javadoc(code_example))
Success, it printed the desired Javadoc. Now we need to store the model on Hugging Face so that we can retrieve it anywhere. To do so, log in to Hugging Face and create a new model repository. That gives you a git repo; you just push all the files from the final_model folder into it. As easy as it gets.
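If you would rather not push via git, the huggingface_hub library can upload the folder directly. A sketch, assuming the repo already exists and you have a write token:

from huggingface_hub import HfApi

api = HfApi(token="YOUR_WRITE_TOKEN")
# Upload everything in final_model to the model repo
api.upload_folder(
    folder_path="/content/drive/MyDrive/final_model",
    repo_id="PavelPolivka/javadocer-small",
    repo_type="model",
)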
Next, we need to create a simple Python app that takes the code we want to generate the Javadoc for as input and returns the generated Javadoc. I used the FastAPI library for that.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
app = FastAPI()
model_path = "/app/model"
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
class CodeInput(BaseModel):
    code: str

@app.get("/")
async def health_check():
    return {"status": "running"}

@app.post("/generate_javadoc")
async def generate_javadoc(code_input: CodeInput):
    try:
        inputs = tokenizer(code_input.code, return_tensors="pt")
        outputs = model.generate(**inputs)
        javadoc = tokenizer.decode(outputs[0], skip_special_tokens=True)
        return {"javadoc": javadoc}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
You can see it's loading the model from /app/model; that is the folder where I store the model inside the container. Very simple and elegant.
To deploy this, I dockerized the app with this simple Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install huggingface-hub transformers
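# Download the fine-tuned model from Hugging Face at build time and bake it into the image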
RUN python -c "from transformers import AutoModelForSeq2SeqLM, AutoTokenizer; \
model = AutoModelForSeq2SeqLM.from_pretrained('PavelPolivka/javadocer-small'); \
tokenizer = AutoTokenizer.from_pretrained('PavelPolivka/javadocer-small'); \
model.save_pretrained('/app/model'); \
tokenizer.save_pretrained('/app/model')"
# Copy the rest of the application code into the container
COPY . .
# Expose the FastAPI app's port
EXPOSE 8000
# Run the FastAPI app
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Here, you can see that I download my model from Hugging Face during the Docker build. Baking the model into the image means the container can spin up fast if need be, since nothing has to be downloaded at runtime.
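The Dockerfile installs dependencies from requirements.txt. The original file is not shown here, but for this app it would contain something like this (my guess, not taken from the repo):

fastapi
uvicorn
pydantic
torch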
To deploy this, I used Sliplane. Again, it is as easy as giving it access to the GitHub repo where you have the simple Python app and the Dockerfile.
To test this new API, you can call it with curl like so:
curl -X POST "https://sliplane.instance.com/generate_javadoc" -H "Content-Type: application/json" -d '{"code": "public int add(int a, int b) { return a + b; }"}'
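The API responds with a small JSON object. For the snippet above you would get back something like this (illustrative output only, yours will differ):

{"javadoc": "Adds two numbers. @param a first number @param b second number @return sum of a and b"}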
You can see the whole code on my GitHub.
Plugin development
Now we have all the building blocks we need to create that IntelliJ plugin.
Create an empty Gradle Java project. I used the Gradle Kotlin DSL, so the examples will be in Kotlin. Your build file will look something like this:
plugins {
    id("java")
    id("org.jetbrains.intellij") version "1.17.4"
}

group = "com.ppolivka"
version = "1.0.1"

repositories {
    mavenCentral()
}

java {
    sourceCompatibility = JavaVersion.VERSION_17
}

intellij {
    version.set("2023.3.7")
    plugins.set(listOf("com.intellij.java"))
}

dependencies {
    implementation("org.json:json:20240303")
}

tasks {
    buildSearchableOptions {
        enabled = false
    }
    patchPluginXml {
        version.set("${project.version}")
        sinceBuild.set("233")
        untilBuild.set("242.*")
    }
}
Gradle will take care of everything and prepare the project for you.
Now you need to create a plugin.xml file and a GenerateJavadocAction class. Assuming the standard Gradle source layout, your file structure should look something like this:
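src/main/java/com/ppolivka/javadocer/GenerateJavadocAction.java
src/main/resources/META-INF/plugin.xml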
In plugin.xml we define our new action, declare that we depend on Java, and say who we are and what the plugin is called.
<idea-plugin>
    <id>com.ppolivka.javadocer</id>
    <name>AI Auto Javadoc Generator</name>
    <description>Generates Javadoc for methods using an AI-powered API.</description>
    <vendor email="polivka.pavel@gmail.com" url="https://ppolivka.com">Pavel Polivka</vendor>

    <depends>com.intellij.modules.platform</depends>
    <depends>com.intellij.modules.java</depends>

    <actions>
        <action id="GenerateJavadocAction" class="com.ppolivka.javadocer.GenerateJavadocAction"
                text="Generate Javadoc" description="Generates Javadoc for the selected method">
            <add-to-group group-id="GenerateGroup" anchor="last"/>
            <keyboard-shortcut keymap="$default" first-keystroke="ctrl alt shift J"/>
        </action>
    </actions>
</idea-plugin>
In the action's Java class, we do the actual Javadoc generation and placement. Let's define what happens when the action is triggered.
@Override
public void actionPerformed(AnActionEvent event) {
    Project project = event.getProject();
    Editor editor = event.getData(com.intellij.openapi.actionSystem.CommonDataKeys.EDITOR);
    PsiFile psiFile = event.getData(com.intellij.openapi.actionSystem.CommonDataKeys.PSI_FILE);
    if (project == null || editor == null || psiFile == null) {
        return;
    }

    PsiElement elementAtCaret = psiFile.findElementAt(editor.getCaretModel().getOffset());
    PsiMethod method = PsiTreeUtil.getParentOfType(elementAtCaret, PsiMethod.class);
    if (method == null) {
        JOptionPane.showMessageDialog(null, "No method found at cursor position.");
        return;
    }

    String javadoc = generateJavadocFromApi(method);
    if (javadoc != null) {
        insertJavadoc(project, psiFile, method, javadoc);
    }
}
We look at where the caret is in the editor and check whether it is inside a method. If so, we take that method's text and call our API with it. (Note that the call here happens on the UI thread; a production plugin would run it in a background task.)
private String generateJavadocFromApi(PsiMethod method) {
    try {
        String apiUrl = "https://sliplane.instance.com/generate_javadoc";
        URL url = new URL(apiUrl);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json; utf-8");
        connection.setDoOutput(true);

        // Create JSON payload using JSONObject for valid JSON encoding
        JSONObject payload = new JSONObject();
        payload.put("code", method.getText());

        try (OutputStream os = connection.getOutputStream()) {
            byte[] input = payload.toString().getBytes("utf-8");
            os.write(input, 0, input.length);
        }

        // Read API response
        BufferedReader br = new BufferedReader(new InputStreamReader(connection.getInputStream(), "utf-8"));
        StringBuilder response = new StringBuilder();
        String responseLine;
        while ((responseLine = br.readLine()) != null) {
            response.append(responseLine.trim());
        }

        // Extract Javadoc comment from JSON response (assuming API returns {"javadoc": "..."})
        JSONObject jsonResponseObject = new JSONObject(response.toString());
        return jsonResponseObject.getString("javadoc");
    } catch (Exception e) {
        e.printStackTrace();
        JOptionPane.showMessageDialog(null, "Error calling API: " + e.getMessage());
    }
    return null;
}
Then we take that generated Javadoc and place it into the editor.
private void insertJavadoc(Project project, PsiFile psiFile, PsiMethod method, String javadoc) {
    ApplicationManager.getApplication().invokeLater(() ->
        WriteCommandAction.runWriteCommandAction(project, () -> {
            // Format the Javadoc to ensure correct multiline structure
            StringBuilder formattedJavadoc = new StringBuilder("/**\n");
            for (String line : javadoc.split("\n")) {
                formattedJavadoc.append(" * ").append(line.trim()).append("\n");
            }
            formattedJavadoc.append(" */");

            // Create the Javadoc comment element
            PsiElement comment = JavaPsiFacade.getElementFactory(project)
                    .createCommentFromText(formattedJavadoc.toString(), null);

            // Insert the formatted Javadoc into the method's parent, right before the method
            method.getParent().addBefore(comment, method);

            // Commit the document to ensure it reflects the changes
            PsiDocumentManager documentManager = PsiDocumentManager.getInstance(project);
            documentManager.doPostponedOperationsAndUnblockDocument(documentManager.getDocument(psiFile));
            documentManager.commitDocument(documentManager.getDocument(psiFile));
        })
    );
}
And that's it. To test it, run ./gradlew runIde; it will open a new IntelliJ instance with the plugin already "installed". To build it, run ./gradlew buildPlugin. It will produce a .zip file that you can upload to the JetBrains Marketplace.
You can view the full source code for this on my GitHub.
And you can check out the plugin on JetBrains Marketplace.
Summary
I learned a lot during this whole project. I feel like I grasped the basics of AI development and got more comfortable with Python. The plugin is not "production ready": the model I trained is based on CodeT5-small, so it is not as smart as it could be, but I did not want to spend a huge amount of time on this. It was more or less a learning exercise.
I hope this article helps you see that AI development does not need to be hard or complicated. To get sneak peeks of my next side project, you can follow me on X.