\documentclass[a4paper, 11pt]{article} \usepackage[OT1]{fontenc} \usepackage{url} \usepackage{Sweave} \begin{document} % \VignetteIndexEntry{helloJavaWorld} \title{Hello Java World! A Tutorial for Interfacing to\\ Java Archives inside R Packages.} \author{Tobias Verbeke} \date{2014-09-03} \maketitle \tableofcontents \section{Introduction} This document provides detailed guidance on interfacing R to Java archives inside an R package. The package we will create in this tutorial, the \texttt{helloJavaWorld} package, will invoke a very simple Java class, the \texttt{HelloJavaWorld} class, from inside an R function of the package, the \texttt{helloJavaWorld} function. The objective is to help other people to make available Java algorithms to the R world, be it to compare results, or for their own sake. \section{Structure of the package} We create a folder, \texttt{helloJavaWorld}, which will be the root folder of our \texttt{helloJavaWorld} package. For detailed information on R packages and their structure, the reader is referred to the Writing R Extensions manual. This section only highlights elements that are relevant to our situation. Before detailing the contents of the individual files and folders, we provide a summary overview in Figure \ref{fig:pkgContents}. \begin{figure} \begin{verbatim} helloJavaWorld `- inst `- doc `- helloJavaWorld.Rnw (this document) `- java `- hellojavaworld.jar `- java `- HelloJavaWorld.java `- man `- helloJavaWorld.Rd `- R `- helloJavaWorld.R `- onLoad.R `- DESCRIPTION `- NAMESPACE \end{verbatim} \caption{Overview of package contents for the \texttt{helloJavaWorld} package.} \label{fig:pkgContents} \end{figure} \paragraph{\texttt{DESCRIPTION}} Inside this folder, we create \texttt{DESCRIPTION} file, which can be considered to be the `identity card' of the package. The most important changes when comparing to regular \texttt{DESCRIPTION} files are that \begin{itemize} \item there must be a package dependency on the \texttt{rJava} package (\texttt{Depends} field) as well as \item a system dependency on Java (\texttt{SystemRequirements} field). \end{itemize} \paragraph{\texttt{inst/java}} We create the folder \verb|inst/java| to host our JAR file. This JAR file contains a single \texttt{HelloJavaWorld.class} file, generated from the following \texttt{HelloJavaWorld.java} file. \begin{verbatim} public class HelloJavaWorld { public String sayHello() { String result = new String("Hello Java World!"); return result; } public static void main(String[] args) { } } \end{verbatim} As one can see, this is not the typical example of a Hello World application. The string \texttt{Hello Java World!} is namely not printed to the console, but returned by the \texttt{sayHello} method. The reason to deviate from this tradition is that in practice the interfacing to Java classes will nearly always result in return values being returned to R by the methods of the classes in the JAR file(s). \paragraph{\texttt{java/}} The Writing R Extensions Manual recommends to make Java source files available in a top-level \texttt{java/} directory inside the package. This will ensure source files are not installed, but makes these files available as required when the package is published under an open source license. \paragraph{\texttt{R/}} Two functions are contained in the \texttt{R/} subfolder of the package. The first function is the function that will assure that the Java code is made available. The second function will be the R wrapper to execute the Java HelloWorld class. Namespaces are recommended in R packages, so we will only detail how to include the first function when the package has a namespace. We include a file \texttt{onLoad.R} with the following content~: \begin{verbatim} .onLoad <- function(libname, pkgname) { .jpackage(pkgname, lib.loc = libname) } \end{verbatim} The \texttt{.onLoad} function is a hook function that will be run immediately after loading the package. The function \texttt{.jpackage} that is called inside the \texttt{.onLoad} function takes care to initialize the JVM and to add the \texttt{java/} folder of the package to the class path. If users make use of custom R library paths, they can rely on the \texttt{libname} argument. Note that when building the package the \verb|inst| level is `taken out', so that in the built package the \verb|java/| subfolder will be directly in the root folder. The second function we included in the \texttt{R/} folder is responsible for making the Java class available to the user and is a simple wrapper: \begin{verbatim} helloJavaWorld <- function(){ hjw <- .jnew("HelloJavaWorld") # create instance of HelloJavaWorld class out <- .jcall(hjw, "S", "sayHello") # invoke sayHello method return(out) } \end{verbatim} The \texttt{.j*} functions such as \texttt{.jnew} and \texttt{.jcall} are part of the \texttt{rJava} package and documented in their respective documentation files\footnote{The \texttt{rJava} package can be obtained from CRAN; more information on the package can be found at the package home page \url{http://www.rforge.net/rJava/}}. For convenience, we provide some minimal explanation here as well. The \texttt{.jnew} function creates a new instance of a Java object. The first argument (here \texttt{"HelloJavaWorld"}) indicates the class of the object to be created. If other arguments are needed to construct the Java object, these can be passed as R arguments to the \texttt{.jnew} function as well. In this very simple case, however, no supplementary arguments are required to create a \texttt{"HelloJavaWorld"} object. By assigning the result of \texttt{.jnew} to a variable, we obtain a reference to the object (here \texttt{hjw}) that we can use to further work with the object, or to call Java methods on the newly created object. The \texttt{.jcall} function allows to call a method (\texttt{"sayHello"}) on a Java object. The first argument will be a reference to the object, which we stored in the \texttt{hjw} variable in the previous statement. The second argument contains a character indicating the return type of the method we call. In this example, the return type will be a String, which has been mapped to the \texttt{"S"} character. An overview of all return types is offered in Table \ref{tab:returntypes}. It is important to note that the \texttt{out} object is no longer a reference to the resulting Java object, but has been evaluated directly to a string, i.e. an R character vector of length one. The explanation is that the \texttt{.jcall} method has an argument \texttt{evalString} that is set to \texttt{TRUE} by default. If we would have wanted to obtain the reference to the Java object to further work with it, we would have explicitly set \texttt{evalString} to \texttt{FALSE} as in \begin{verbatim} outRef <- .jcall(hjw, "S", "sayHello", evalString = FALSE) \end{verbatim} To explicitly obtain a string from a Java object reference, we could then use the function{.jstrVal}, as in \begin{verbatim} .jstrVal(outRef) \end{verbatim} The string returned would, of course, be exactly the same \begin{verbatim} "Hello Java World!" \end{verbatim} \begin{table} \begin{tabular}{lp{8cm}} \hline Abbreviation & Type \\ \hline \texttt{"V"} & void \\ \texttt{"I"} & integer \\ \texttt{"D"} & double (numeric) \\ \texttt{"J"} & long \\ \texttt{"F"} & float \\ \texttt{"Z"} & boolean \\ \texttt{"C"} & char (integer)\\ \texttt{"S"} & String \\ \texttt{"B"} & byte (raw)\\ \texttt{"L"} & Java object of class \texttt{} (e.g. \texttt{"Ljava/lang/Object"})\\ \texttt{"["} & Array of objects of type \texttt{} (e.g. \texttt{"[D"} for an array of doubles)\\ \hline \end{tabular} \caption{Overview of the abbreviations that can be used as a shortcut to indicate the return type of a Java method in the \texttt{.jcall} function of the \texttt{rJava} package.} \label{tab:returntypes} \end{table} \paragraph{NAMESPACE} In the R functions of our package, some functions were used from the \texttt{rJava} package. As this package has a namespace as well, it is necessary to import the relevant functions or the whole package into the package namespace to ensure that the \texttt{rJava} package will be loaded as well. Importing the whole package can be done using the \texttt{import} directive. If, on the other hand, we want to make our functions (or other objects) available to the package user, we need to export these objects using the \texttt{export} directive. We will of course not export the \verb|.onLoad| function as it is not meant to be used by the package user. The only function exported, therefore, is the \texttt{helloJavaWorld} function. The \texttt{NAMESPACE} file will thus have the following contents\footnote{For more information on package namespaces, please consult Section 1.6 from the Writing R Extensions manual.}~: \begin{verbatim} import("rJava") export("helloJavaWorld") \end{verbatim} \section{Use of the package} Once the package is checked, built and installed, we can load the package with \begin{verbatim} library(helloJavaWorld) \end{verbatim} and demonstrate it by simply calling \begin{verbatim} > helloJavaWorld() [1] "Hello Java World!" \end{verbatim} \section{Acknowledgements} We would like to thank Simon Urbanek for his continuous efforts on the \texttt{rJava} package, without which it would not have been possible to say Hello from Java to R. The following persons provided feedback and suggestions that helped improve the document: Duncan Murdoch, Prof. Ripley, Simon Urbanek, Hadley Wickham. \end{document}